Programmatic Ponderings

transaction_id	date	time	product_id	product	calories	price	type	quantity	amount	payment_type
47a157f84e727fe3335db1519ee736a6	06/27/2022	19:59:42	22	Quiche	456	4.99	Food	2	9.98	Debit
1bf01013e699ca0f804650ea50826c82	11/20/2022	06:21:14	22	Quiche	456	4.99	Food	3	14.97	Cash
84f41c15749090d1e79bf9a48a58d6c3	08/18/2022	11:50:22	14	Chai Tea	200	3.5	Drink	2	7.0	Apple Pay
ef1845b8438bf3b5b99d2f4891a48f03	11/13/2022	17:20:51	12	Lemonade	120	3.0	Drink	2	6.0	Debit
9863de11be3099d6361392584e30e624	06/03/2022	18:27:03	18	Muffin	426	3.99	Food	2	7.98	Gift card
f50ed8878250bc06f66b97f5cd2f6df7	02/21/2022	17:02:18	7	Hot Chocolate	300	3.5	Drink	2	7.0	Credit
1903169473f41a0275ee702f2c6b1dd6	05/24/2022	14:58:25	10	Smoothie	200	4.0	Drink	3	12.0	Venmo
164a9519fd3db952e721e9f55dc1be74	01/07/2022	14:19:35	14	Chai Tea	200	3.5	Drink	2	7.0	Debit
dc85a202143de48ad4646190cdc0bf5c	01/28/2022	08:52:38	20	Wrap	388	5.99	Food	1	5.99	Venmo
7d182be793ab4b3290d14968e9c9a3e3	02/20/2022	10:16:10	16	Croissant	231	2.99	Food	2	5.98	Google Pay
f524811f456481f5a4a2f0c09dcafa28	10/03/2022	16:19:35	10	Smoothie	200	4.0	Drink	3	12.0	Debit
f2f63eacd33cb9f13849b08638e6fc3d	07/06/2022	16:14:20	10	Smoothie	200	4.0	Drink	3	12.0	Cash
7435e8cc8771a8b538529ff6f873cc05	04/12/2022	16:35:51	14	Chai Tea	200	3.5	Drink	3	10.5	Venmo
0e1154abb6a79b9ece570d51e3116846	11/28/2022	14:44:05	21	Salad	231	7.99	Food	2	15.98	Debit
d5926d58011a2a25798bf2aa72552cf0	04/19/2022	16:06:09	22	Quiche	456	4.99	Food	1	4.99	Venmo

id	address	city	state	zip	country	property_type	assessed_value
1	1008 Walk Burg	Houston	TX	77002	United States	Multi-family	1122321
2	7088 Second Square	Oklahoma City	OK	73102	United States	Single-family	261940
3	1425 Ridge Terrace	Indianapolis	IN	46204	United States	Single-family	1030391
4	982 Way Lane	New York	NY	10007	United States	Multi-family	95499
5	9404 Port Court	Columbus	OH	43215	United States	Single-family	922404
6	7135 Crossing Trail	Virginia Beach	VA	23451	United States	Single-family	272910
7	9481 Harbor Brook	New York	NY	10007	United States	Multi-family	232795
8	8585 Manor Branch	Raleigh	NC	27601	United States	Single-family	701217
9	7703 Bluff Boulevard	Las Vegas	NV	89101	United States	Single-family	530581
10	4186 Worth Circle	New York	NY	10007	United States	Townhouse	626577
11	8739 Prairie Trail	Portland	OR	97201	United States	Single-family	316167
12	4 Shores Road	Virginia Beach	VA	23451	United States	Single-family	925711
13	3148 Rocks Road	Mesa	AZ	85201	United States	Single-family	182479
14	1545 Wells Drive	Chicago	IL	60602	United States	Multi-family	1000777
15	1699 Ash Brook	Seattle	WA	98101	United States	Single-family	605392

user_id	first_name	last_name	dob	gender	marital_status	race	religion
1	Thomas	Powell	1967-06-10	Male	Married	Black	Christian
2	Ward	Williams	1973-07-22	Male	Single	Asian	Christian
3	Martha	Watson	1975-02-28	Female	Single	Hispanic	Agnostic
4	Brenda	Bailey	1979-07-07	Female	Married	Black	Christian
5	Parker	Johnson	1955-07-14	Male	Married	White	Christian
6	Rebecca	Wilson	1972-05-27	Female	Married	White	Christian
7	Doris	Allen	1956-07-09	Female	Married	Multiracial	Christian
8	Rebecca	Sanchez	1965-09-16	Female	Single	White	Christian
9	Mary	Johnson	1971-04-04	Female	Single	White	Christian
10	Anderson	Roberts	1983-11-02	Male	Single	Hispanic	Christian
11	Robinson	Peterson	1974-10-25	Male	Single	White	Jewish
12	Lopez	Ross	1985-04-30	Male	Married	White	Christian
13	Flores	Reed	1977-09-01	Male	Married	Black	Agnostic
14	Martin	Phillips	1960-03-24	Male	Single	White	Christian
15	Carter	Torres	1983-07-31	Male	Married	White	Christian

film_id	title	release_year	film_language	rating	categories	actor_array	rental_duration	length_minutes	replacement_cost	rental_rate
389	Gunfighter Mussolini	2006	English	PG-13	{Sports}	{"Audrey Olivier","Judy Dean","Scarlett Damon","Russell Close"}	3	127	9.99	2.99
581	Minority Kiss	2006	English	G	{Music}	{"Vivien Basinger"}	4	59	16.99	0.99
598	Mosquito Armageddon	2006	English	G	{Sports}	{"Goldie Brody","Kirk Jovovich","Nick Stallone","Reese West"}	6	57	22.99	0.99
943	Villain Desperate	2006	English	PG-13	{Documentary}	{"Dustin Tautou","Cary Mcconaughey"}	4	76	27.99	4.99
490	Jumanji Blade	2006	English	G	{New}	{"Jennifer Davis","Bob Fawcett","Nick Stallone","Gary Phoenix","Mena Temple","Jim Mostel"}	4	121	13.99	2.99
243	Doors President	2006	English	NC-17	{Animation}	{"Karl Berry","Lucille Tracy","Natalie Hopkins","Christian Akroyd","Sylvester Dern","Gene Hopkins","Ed Mansfield","Kim Allen","Reese West"}	3	49	22.99	4.99
40	Army Flintstones	2006	English	R	{Documentary}	{"Ed Chase","Cary Mcconaughey","Mae Hoffman","Gene Willis","Penelope Cronyn","Matthew Carrey","Russell Close"}	4	148	22.99	0.99
317	Fireball Philadelphia	2006	English	PG	{Comedy}	{"Val Bolger","Jude Cruise","Adam Grant","James Pitt","Frances Tomei"}	4	148	25.99	0.99
17	Alone Trip	2006	English	R	{Music}	{"Ed Chase","Karl Berry","Uma Wood","Woody Jolie","Spencer Depp","Chris Depp","Laurence Bullock","Renee Ball"}	3	82	14.99	0.99
195	Crowds Telemark	2006	English	R	{Sci-Fi}	{"Matthew Johansson","Anne Cronyn","Jeff Silverstone","Matthew Carrey"}	3	112	16.99	4.99

id	date	patient	code	description	reasoncode	reasondescription
714fd61a-f9fd-43ff-87b9-3cc45a3f1e53	2014-01-09	33f33990-ae8b-4be8-938f-e47ad473abfe	185345009	Encounter for symptom	444814009	Viral sinusitis (disorder)
23e07532-8b96-4d05-b14e-d4c5a5288ed2	2014-08-18	33f33990-ae8b-4be8-938f-e47ad473abfe	185349003	Outpatient Encounter
45044100-aaba-4209-8ad1-15383c76842d	2015-07-12	33f33990-ae8b-4be8-938f-e47ad473abfe	185345009	Encounter for symptom	36971009	Sinusitis (disorder)
ffdddbfb-35e8-4a74-a801-89e97feed2f3	2014-08-12	36d131ee-dd5b-4acb-acbe-19961c32c099	185345009	Encounter for symptom	444814009	Viral sinusitis (disorder)
352d1693-591a-4615-9b1b-f145648f49cc	2016-05-25	36d131ee-dd5b-4acb-acbe-19961c32c099	185349003	Outpatient Encounter
4620bd2f-8010-46a9-82ab-8f25eb621c37	2016-10-07	36d131ee-dd5b-4acb-acbe-19961c32c099	185345009	Encounter for symptom	195662009	Acute viral pharyngitis (disorder)
815494d8-2570-4918-a8de-fd4000d8100f	2010-08-02	660bec03-9e58-47f2-98b9-2f1c564f3838	698314001	Consultation for treatment
67ec5c2d-f41e-4538-adbe-8c06c71ddc35	2010-11-22	660bec03-9e58-47f2-98b9-2f1c564f3838	170258001	Outpatient Encounter
dbe481ce-b961-4f43-ac0a-07fa8cfa8bdd	2012-11-21	660bec03-9e58-47f2-98b9-2f1c564f3838	50849002	Emergency room admission
b5f1ab7e-5e67-4070-bcf0-52451eb20551	2013-12-04	660bec03-9e58-47f2-98b9-2f1c564f3838	185345009	Encounter for symptom	10509002	Acute bronchitis (disorder)

date	patient	encounter	code	description	value	units
2011-07-02	33f33990-ae8b-4be8-938f-e47ad473abfe	673daa98-67e9-4e80-be46-a0b547533653	8302-2	Body Height	175.76	cm
2011-07-02	33f33990-ae8b-4be8-938f-e47ad473abfe	673daa98-67e9-4e80-be46-a0b547533653	29463-7	Body Weight	56.51	kg
2011-07-02	33f33990-ae8b-4be8-938f-e47ad473abfe	673daa98-67e9-4e80-be46-a0b547533653	39156-5	Body Mass Index	18.29	kg/m2
2011-07-02	33f33990-ae8b-4be8-938f-e47ad473abfe	673daa98-67e9-4e80-be46-a0b547533653	8480-6	Systolic Blood Pressure	119.0	mmHg
2011-07-02	33f33990-ae8b-4be8-938f-e47ad473abfe	673daa98-67e9-4e80-be46-a0b547533653	8462-4	Diastolic Blood Pressure	77.0	mmHg
2012-06-17	33f33990-ae8b-4be8-938f-e47ad473abfe	be0aa510-645e-421b-ad21-8a1ab442ca48	8302-2	Body Height	177.25	cm
2012-06-17	33f33990-ae8b-4be8-938f-e47ad473abfe	be0aa510-645e-421b-ad21-8a1ab442ca48	29463-7	Body Weight	59.87	kg
2012-06-17	33f33990-ae8b-4be8-938f-e47ad473abfe	be0aa510-645e-421b-ad21-8a1ab442ca48	39156-5	Body Mass Index	19.05	kg/m2
2012-06-17	33f33990-ae8b-4be8-938f-e47ad473abfe	be0aa510-645e-421b-ad21-8a1ab442ca48	8480-6	Systolic Blood Pressure	113.0	mmHg
2012-03-26	36d131ee-dd5b-4acb-acbe-19961c32c099	296a1fd4-56de-451c-a5fe-b50f9a18472d	8302-2	Body Height	174.17	cm

start	stop	patient	encounter	code	description
2012-09-05	2012-10-16	bc33b032-8e41-4d16-bc7e-00b674b6b9f8	05a6ef43-d690-455e-ab2f-1ea19d902274	44465007	Sprain of ankle
2014-09-08	2014-09-28	bc33b032-8e41-4d16-bc7e-00b674b6b9f8	1cdcbe46-caaf-4b3f-b58c-9ca9ccb13013	283371005	Laceration of forearm
2014-11-28	2014-12-13	bc33b032-8e41-4d16-bc7e-00b674b6b9f8	b222e257-98da-4a1b-a46c-45d5ad01bbdc	195662009	Acute viral pharyngitis (disorder)
1980-01-09		01858c8d-f81c-4a95-ab4f-bd79fb62b284	ffbd4177-280a-4a08-a1af-9770a06b5146	40055000	Chronic sinusitis (disorder)
1989-06-25		01858c8d-f81c-4a95-ab4f-bd79fb62b284	ffbd4177-280a-4a08-a1af-9770a06b5146	201834006	Localized primary osteoarthritis of the hand
1996-01-07		01858c8d-f81c-4a95-ab4f-bd79fb62b284	ffbd4177-280a-4a08-a1af-9770a06b5146	196416002	Impacted molars
2016-02-07		01858c8d-f81c-4a95-ab4f-bd79fb62b284	748cda45-c267-46b2-b00d-3b405a44094e	15777000	Prediabetes
2016-04-27	2016-05-20	01858c8d-f81c-4a95-ab4f-bd79fb62b284	a64734f1-5b21-4a59-b2e8-ebfdb9058f8b	444814009	Viral sinusitis (disorder)
2014-02-06	2014-02-19	d32e9ad2-4ea1-4bb9-925d-c00fe85851ae	c64d3637-8922-4531-bba5-f3051ece6354	43878008	Streptococcal sore throat (disorder)
1982-05-18		08858d24-52f2-41dd-9fe9-cbf1f77b28b2	3fff3d52-a769-475f-b01b-12622f4fee17	368581000119106	Neuropathy due to type 2 diabetes mellitus (disorder)

	# Purpose: Generate coffee shop sales data
	# Author: Gary A. Stafford and GitHub Copilot
	# Date: 2023-04-12
	# Usage: python3 coffee_shop_data_gen.py 100
	# Command-line argument(s): rec_count (number of records to generate as an integer)

	# Write a program that creates synthetic sales data for a coffee shop.
	# The program should accept a command line argument that specifies the number of records to generate.
	# The program should write the sales data to a file called 'coffee_shop_sales_data.csv'.
	# The program should contain the following functions:
	# - main() function that calls the other functions
	# - function that returns one random product from a list of dictionaries
	# - function that returns a dictionary containing one sales record
	# - function that writes the sales records to a file

	import argparse
	import csv
	import hashlib
	import random
	from datetime import datetime, timedelta


	def main():
	    # create a parser object
	    parser = argparse.ArgumentParser(
	        description="Generate coffee shop sales data")

	    # add a command line argument to specify the number of records to generate
	    # nargs="?" makes the positional argument optional, so the default can apply
	    parser.add_argument("num_recs",
	                        type=int,
	                        nargs="?",
	                        help="The number of records to generate",
	                        default=100)

	    num_recs = parser.parse_args().num_recs

	    write_data(num_recs)


	# Write a function to create list of dictionaries.
	# The list of dictionaries should contain 15 drink items and 10 food items sold in a coffee shop.
	# Include the product id, product name, calories, price, and type (Food or Drink).
	# Capitalize the first letter of each product name.
	# Return a random item from the list of dictionaries.
	def get_product():
	    products = [
	        {"id": 1, "product": "Latte", "calories": 120, "price": 3.50, "type": "Drink"},
	        {"id": 2, "product": "Cappuccino", "calories": 100, "price": 3.00, "type": "Drink"},
	        {"id": 3, "product": "Americano", "calories": 5, "price": 2.50, "type": "Drink"},
	        {"id": 4, "product": "Espresso", "calories": 10, "price": 2.00, "type": "Drink"},
	        {"id": 5, "product": "Mocha", "calories": 250, "price": 4.00, "type": "Drink"},
	        {"id": 6, "product": "Iced Coffee", "calories": 80, "price": 2.50, "type": "Drink"},
	        {"id": 7, "product": "Hot Chocolate", "calories": 300, "price": 3.50, "type": "Drink"},
	        {"id": 8, "product": "Tea", "calories": 0, "price": 2.00, "type": "Drink"},
	        {"id": 9, "product": "Frappe", "calories": 450, "price": 5.00, "type": "Drink"},
	        {"id": 10, "product": "Smoothie", "calories": 200, "price": 4.00, "type": "Drink"},
	        {"id": 11, "product": "Iced Tea", "calories": 0, "price": 2.50, "type": "Drink"},
	        {"id": 12, "product": "Lemonade", "calories": 120, "price": 3.00, "type": "Drink"},
	        {"id": 13, "product": "Hot Tea", "calories": 0, "price": 2.00, "type": "Drink"},
	        {"id": 14, "product": "Chai Tea", "calories": 200, "price": 3.50, "type": "Drink"},
	        {"id": 15, "product": "Iced Chai", "calories": 250, "price": 4.00, "type": "Drink"},
	        {"id": 16, "product": "Croissant", "calories": 231, "price": 2.99, "type": "Food"},
	        {"id": 17, "product": "Bagel", "calories": 289, "price": 3.49, "type": "Food"},
	        {"id": 18, "product": "Muffin", "calories": 426, "price": 3.99, "type": "Food"},
	        {"id": 19, "product": "Sandwich", "calories": 512, "price": 6.99, "type": "Food"},
	        {"id": 20, "product": "Wrap", "calories": 388, "price": 5.99, "type": "Food"},
	        {"id": 21, "product": "Salad", "calories": 231, "price": 7.99, "type": "Food"},
	        {"id": 22, "product": "Quiche", "calories": 456, "price": 4.99, "type": "Food"},
	        {"id": 23, "product": "Scone", "calories": 335, "price": 2.49, "type": "Food"},
	        {"id": 24, "product": "Pastry", "calories": 397, "price": 3.99, "type": "Food"},
	        {"id": 25, "product": "Cake", "calories": 512, "price": 5.99, "type": "Food"},
	    ]

	    # return one random item from list of dictionaries
	    return random.choice(products)


	# Write a function to return a random sales record.
	# The record should be a dictionary with the following fields:
	# - transaction_id (a hash of the date, time, and product_id)
	# - date (a random date between 1/1/2022 and 12/31/2022)
	# - time (a random time between 6:00am and 9:00pm in 1 minute increments)
	# - product_id, product, calories, price, and type (from the get_product function)
	# - quantity (a random integer between 1 and 3)
	# - amount (price * quantity)
	# - payment type (Cash, Credit, Debit, Gift Card, Apple Pay, Google Pay, or Venmo)
	def get_sales_record():
	    # get a random product
	    product = get_product()

	    # get a random date between 1/1/2022 and 12/31/2022
	    start_date = datetime(2022, 1, 1)
	    end_date = datetime(2022, 12, 31)
	    random_date = start_date + timedelta(
	        # Get a random number of seconds between 0 and the number of seconds between start_date and end_date
	        seconds=random.randint(0, int(
	            (end_date - start_date).total_seconds())), )

	    # get a random time between 6:00am and 9:00pm
	    start_time = datetime.strptime("6:00am", "%I:%M%p")
	    end_time = datetime.strptime("9:00pm", "%I:%M%p")
	    random_time = start_time + timedelta(
	        # Get a random number of seconds between 0 and the number of seconds between start_time and end_time
	        seconds=random.randint(0, int(
	            (end_time - start_time).total_seconds())), )

	    # get a random quantity between 1 and 3
	    random_quantity = random.randint(1, 3)

	    # get a random payment type:
	    # Cash, Credit, Debit, Gift card, Apple Pay, Google Pay, Venmo
	    random_payment_type = random.choice(
	        ["Cash", "Credit", "Debit", "Gift card", "Apple Pay", "Google Pay", "Venmo"])

	    sales_record = {
	        "date": random_date.strftime("%m/%d/%Y"),
	        "time": random_time.strftime("%H:%M:%S"),
	        "product_id": product["id"],
	        "product": product["product"],
	        "calories": product["calories"],
	        "price": product["price"],
	        "type": product["type"],
	        "quantity": random_quantity,
	        # round to cents to avoid floating-point artifacts in the product
	        "amount": round(product["price"] * random_quantity, 2),
	        "payment_type": random_payment_type,
	    }

	    return sales_record


	# Write a function to write the sales records to a CSV file called 'coffee_shop_sales_data.csv'.
	# Use an input parameter to specify the number of records to write.
	# Call the get_sales_record function once for each record to write.
	# The CSV file must have a header row and be comma delimited.
	# All string values must be enclosed in double quotes.
	def write_data(rec_count):
	    # open the file for writing
	    with open("output/coffee_shop_sales_data.csv", "w", newline="") as csv_file:
	        # create a csv writer object
	        csv_writer = csv.writer(csv_file,
	                                delimiter=",",
	                                quotechar='"',
	                                quoting=csv.QUOTE_NONNUMERIC)

	        # write the header row
	        # transaction_id,date,time,product_id,product,calories,price,type,quantity,amount,payment_type
	        csv_writer.writerow([
	            "transaction_id",
	            "date",
	            "time",
	            "product_id",
	            "product",
	            "calories",
	            "price",
	            "type",
	            "quantity",
	            "amount",
	            "payment_type",
	        ])

	        # write the sales records
	        for _ in range(rec_count):
	            sale = get_sales_record()
	            # the transaction id is an MD5 hash of the date, time, and product id
	            transaction_id = hashlib.md5((f'{sale["date"]} {sale["time"]} {sale["product_id"]}').encode()).hexdigest()
	            csv_writer.writerow([
	                transaction_id,
	                sale["date"],
	                sale["time"],
	                sale["product_id"],
	                sale["product"],
	                sale["calories"],
	                sale["price"],
	                sale["type"],
	                sale["quantity"],
	                sale["amount"],
	                sale["payment_type"],
	            ])


	if __name__ == "__main__":
	    main()
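
The script above writes each record with an MD5 `transaction_id`, a quantity from 1 to 3, and an `amount` equal to price times quantity. A quick way to sanity-check a generated file against those rules might look like the sketch below. The function name `validate_sales_rows` and the tolerance of half a cent are assumptions made for illustration; they are not part of the original script.

```python
import csv
import io
import re


# Hypothetical helper (not in the original script): checks rows that follow
# the column layout of coffee_shop_sales_data.csv shown above.
def validate_sales_rows(csv_text):
    """Return a list of (row_number, problem) tuples; an empty list means no issues were found."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # an MD5 hex digest is 32 lowercase hexadecimal characters
        if not re.fullmatch(r"[0-9a-f]{32}", row["transaction_id"]):
            problems.append((row_number, "bad transaction_id"))
        # quantity is generated as an integer between 1 and 3
        quantity = int(row["quantity"])
        if not 1 <= quantity <= 3:
            problems.append((row_number, "quantity out of range"))
        # amount should equal price * quantity, compared to the cent
        expected = round(float(row["price"]) * quantity, 2)
        if abs(float(row["amount"]) - expected) > 0.005:
            problems.append((row_number, "amount mismatch"))
    return problems
```

To check a file on disk, pass its contents, for example `validate_sales_rows(open("output/coffee_shop_sales_data.csv").read())`. Note that because the hash covers only date, time, and product id, two records sharing those three values would share a `transaction_id`; this sketch does not test for that.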

	# set postgres environment variables
	# CHANGE ME
	export PGHOST="postgres1.abcxyzdef.us-east-1.rds.amazonaws.com"
	export PGPORT=5432
	export PGDATABASE="postgres"
	export PGUSER="admin"
	export PGPASSWORD="change_me!"

	# create new v1 of pagila database
	export PGDATABASE="postgres"
	psql -c "CREATE DATABASE pagila_v1;"

	# restore original version of pagila database
	pg_restore -d pagila_v1 pagila.dump

	# confirm pagila tables in public schema
	export PGDATABASE="pagila_v1"
	psql -c "\dt"

	# dump v1 of pagila database
	pg_dump -Fc -d pagila_v1 -f pagila_v1.dump

	# create new v2 of pagila database
	psql -c "CREATE DATABASE pagila_v2;"

	# restore v1 of pagila database
	pg_restore -d pagila_v2 pagila_v1.dump

	# connect to new pagila database
	export PGDATABASE="pagila_v2"
	psql

	-- wrap in transaction
	BEGIN;

	-- optional, should be set to public by default
	SET search_path TO public;

	-- create new schemas
	CREATE SCHEMA common;
	CREATE SCHEMA customers;
	CREATE SCHEMA films;
	CREATE SCHEMA sales;
	CREATE SCHEMA staff;
	CREATE SCHEMA stores;

	-- common
	ALTER TABLE address SET SCHEMA common;
	ALTER TABLE city SET SCHEMA common;
	ALTER TABLE country SET SCHEMA common;

	-- customers
	ALTER TABLE customer SET SCHEMA customers;

	-- films
	ALTER TABLE actor SET SCHEMA films;
	ALTER TABLE category SET SCHEMA films;
	ALTER TABLE film SET SCHEMA films;
	ALTER TABLE language SET SCHEMA films;
	ALTER TABLE film_actor SET SCHEMA films;
	ALTER TABLE film_category SET SCHEMA films;

	-- sales
	ALTER TABLE payment SET SCHEMA sales;
	ALTER TABLE rental SET SCHEMA sales;

	-- staff
	ALTER TABLE staff SET SCHEMA staff;

	-- stores
	ALTER TABLE store SET SCHEMA stores;
	ALTER TABLE inventory SET SCHEMA stores;

	COMMIT;

	-- confirm all tables are removed from public schema
	\dt

	CREATE OR REPLACE VIEW sales.sales_by_store AS
	SELECT (c.city || ','::text) || cy.country AS store,
	    (m.first_name || ' '::text) || m.last_name AS manager,
	    sum(p.amount) AS total_sales
	FROM sales.payment p
	    JOIN sales.rental r ON p.rental_id = r.rental_id
	    JOIN stores.inventory i ON r.inventory_id = i.inventory_id
	    JOIN stores.store s ON i.store_id = s.store_id
	    JOIN common.address a ON s.address_id = a.address_id
	    JOIN common.city c ON a.city_id = c.city_id
	    JOIN common.country cy ON c.country_id = cy.country_id
	    JOIN staff.staff m ON s.manager_staff_id = m.staff_id
	GROUP BY cy.country, c.city, s.store_id,
	    m.first_name, m.last_name
	ORDER BY cy.country, c.city;

	# Purpose: Generate US residential address data
	# Author: Gary A. Stafford and GitHub Copilot
	# Date: 2023-04-13
	# Usage: python3 residential_address_data_gen.py 100
	# Command-line argument(s): rec_count (number of records to generate as an integer)

	# Write an application that creates a random list of United States addresses.
	# The application should accept a command line argument that specifies the number of records to generate.
	# Include address, city, state, zip code, country, and property type.
	# Write the data to a csv file named 'address_data.csv'.
	# The application should contain the following functions:
	# - main() function that calls the other functions
	# - function that returns a list of common street names in the United States
	# - function that returns a list of common street types in the United States
	# - function that returns a list of common city, state, zip code, and population in the United States
	# - function that returns a property type


	import csv
	import random
	import argparse

	cities_final = []


	def main():
	    parser = argparse.ArgumentParser(description="Generate US residential address data")
	    # nargs="?" makes the positional argument optional, so the default can apply
	    parser.add_argument(
	        "rec_count", type=int, nargs="?", help="The number of records to generate", default=100
	    )

	    # add population calculations to the city data
	    cities = get_cities()
	    prepare_cities(cities)

	    rec_count = parser.parse_args().rec_count
	    write_data(rec_count)


	# Write a function that creates a list of common street names
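
The listing above breaks off before `get_cities()` and `prepare_cities()` are shown, so how the population figures are actually used is not visible here. One plausible use, offered only as a sketch under that assumption, is to make larger cities appear more often by weighting the random selection. The function name `pick_city`, the `weight` field, and the three placeholder records below are illustrative and are not the author's data or code.

```python
import random

# Placeholder records for illustration only: the relative weights stand in
# for population figures, which are not shown in the original listing.
SAMPLE_CITIES = [
    {"city": "New York", "state": "NY", "zip": "10007", "weight": 3},
    {"city": "Houston", "state": "TX", "zip": "77002", "weight": 2},
    {"city": "Mesa", "state": "AZ", "zip": "85201", "weight": 1},
]


def pick_city(cities, rng=random):
    """Return one city record, chosen with probability proportional to its weight."""
    weights = [city["weight"] for city in cities]
    # random.choices performs weighted sampling with replacement
    return rng.choices(cities, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the selection repeatable, which can help when the generated data is used in tests.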

	# Purpose: Generate demographic data
	# Author: Gary A. Stafford and GitHub Copilot
	# Date: 2023-04-14
	# Usage: python3 demographic_data_gen.py 100
	# Command-line argument(s): rec_count (number of records to generate as an integer)

	# Write an application that creates a file containing demographic data.
	# The application should accept a command line argument that specifies the number of records to generate.
	# The application should write the demographic data to a file called 'demographic_data.csv'.
	# The application should contain the following functions:
	# - main() function that calls the other functions
	# - function that returns a random first name
	# - function that returns a random last name
	# - function that returns a random date of birth
	# - function that returns a random gender
	# - function that returns a random religious affiliation
	# - function that returns a random race


	import random
	import argparse
	import csv
	from datetime import date, timedelta


	def main():
	    parser = argparse.ArgumentParser(description="Generate demographic data")
	    # nargs="?" makes the positional argument optional, so the default can apply
	    parser.add_argument(
	        "rec_count", type=int, nargs="?", help="The number of records to generate", default=100
	    )

	    rec_count = parser.parse_args().rec_count
	    write_data(rec_count)


	# Write a function that generates a list of common feminine first names in the United States.
	# List should be in alphabetical order.
	# Each name should be unique.
	# Return random first name.
	def get_first_name_feminine():
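
The listing above is cut off, so the helper functions named in the prompt comments are not shown. As an illustration of one of them, the random date of birth function could be sketched roughly as follows. The name `get_random_dob`, the age bounds of 18 and 80, and the 365.25-day year approximation are all assumptions, not details taken from the original script.

```python
import random
from datetime import date, timedelta


# Hypothetical sketch: the original implementation is not shown in the listing.
def get_random_dob(min_age=18, max_age=80, today=None, rng=random):
    """Return a random date of birth for an age roughly between min_age and max_age."""
    today = today or date.today()
    # approximate a year as 365.25 days; exact birthday arithmetic is not needed here
    earliest = today - timedelta(days=int(max_age * 365.25))
    latest = today - timedelta(days=int(min_age * 365.25))
    # pick a uniformly random day within the allowed range (both ends inclusive)
    offset = rng.randint(0, (latest - earliest).days)
    return earliest + timedelta(days=offset)
```

Calling `get_random_dob().isoformat()` yields a string in the same `YYYY-MM-DD` form as the `dob` column in the sample data.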

	-- wrap in transaction
	BEGIN;

	-- create new customers.address table
	CREATE SEQUENCE IF NOT EXISTS customers.address_address_id_seq
	INCREMENT 1
	START 1
	MINVALUE 1
	MAXVALUE 9223372036854775807
	CACHE 1;

	ALTER SEQUENCE customers.address_address_id_seq
	OWNER TO pagila_admin;

	CREATE TABLE IF NOT EXISTS customers.address (
	address_id integer DEFAULT nextval('customers.address_address_id_seq'::regclass) NOT NULL PRIMARY KEY,
	address text NOT NULL,
	address2 text,
	district text NOT NULL,
	city_id smallint NOT NULL REFERENCES common.city ON UPDATE CASCADE ON DELETE RESTRICT,
	postal_code text,
	phone text NOT NULL,
	last_update timestamp with time zone DEFAULT now() NOT NULL
	);

	ALTER TABLE customers.address
	OWNER TO pagila_admin;

	CREATE INDEX IF NOT EXISTS idx_fk_city_id ON customers.address(city_id);

	CREATE TRIGGER last_updated
	BEFORE UPDATE ON customers.address FOR EACH ROW
	EXECUTE PROCEDURE last_updated();

	COMMIT;

	-- wrap in transaction
	BEGIN;

	-- copy only customer addresses to new customers.address table
	INSERT INTO customers.address
	SELECT *
	FROM common.address
	WHERE common.address.address_id IN (
	SELECT DISTINCT address_id
	FROM customers.customer
	);

	-- copy only staff addresses to new staff.address table
	INSERT INTO staff.address
	SELECT *
	FROM common.address
	WHERE common.address.address_id IN (
	SELECT DISTINCT address_id
	FROM staff.staff
	);

	-- copy only store addresses to new stores.address table
	INSERT INTO stores.address
	SELECT *
	FROM common.address
	WHERE common.address.address_id IN (
	SELECT DISTINCT address_id
	FROM stores.store
	);

	-- check for extraneous data in common.address before deleting
	SELECT *
	FROM common.address
	WHERE common.address.address_id NOT IN
	(SELECT DISTINCT address_id FROM customers.customer)
	AND common.address.address_id NOT IN
	(SELECT DISTINCT address_id FROM staff.staff)
	AND common.address.address_id NOT IN
	(SELECT DISTINCT address_id FROM stores.store);
	COMMIT;


Introduction

Common Forms of Synthetic Data

Synthetic Tabular Data Types

Challenges with Creating Synthetic Data

Difficult Patterns and Behaviors to Model

Easily Modeled Patterns and Behaviors

Creating Synthetic Data with Generative AI

Using IDE-based Generative AI Tools

Source Code

Example #1: Coffee Shop Sales Data

Example #2: Residential Address Data

Example #3: Demographic Data

Generative AI Tools for Unit Testing

Conclusion

Terminology

Artificial General Intelligence (AGI)

Artificial Intelligence (AI)

ChatGPT

DALL·E

Deep Learning (DL)

Generative AI

Generative Pre-trained Transformer (GPT)

Intelligence Amplification

Large Language Model (LLM)

Machine Learning (ML)

Neural Network

Types of Neural Networks

OpenAI

Prompt Engineering

Reinforcement Learning with Human Feedback (RLHF)

Ready for More?

Introduction

Patterns

Multi-Account Advantages

AWS Control Tower

Common Multi-Account Patterns

Pattern 1: Single “Uber” Account

Pros

Cons

Pattern 2: Non-Prod/Prod Environments

Pros

Cons

Pattern 3: Upper/Lower Environments

Pros

Cons

Pattern 4: SDLC Environments

Pros

Cons

Pattern 5: Major Workload Separation

Pros

Cons

Pattern 6: Backup

Pros

Cons

Pattern 7: Sandboxes

Pros

Cons

Pattern 8: Centralized Management and Governance

Pros

Cons

Pattern 9: Internal/External Environments

Pros

Cons

Pattern 10: PCI DSS Workloads

Pros

Cons

Pattern 11: Vendors and Contractors

Pros

Cons

Pattern 12: Mergers and Acquisitions

Pros

Cons

Multi-Account AWS Environment Example

Conclusion

Recommended References

Introduction

Terminology

Monolithic Architecture

Microservices Architecture