Dataframes and Lists / Data Types / Docker Builds / PSQL

6 min readMar 30, 2024

The code goes back and forth between dataframes and lists. This must be ineffecient.

There are multiple lists. Elements at the same index in each list are a set.

If I remove an element from a list, the index of every subsequent element changes.

Current solution: Make every list a column in a dataframe. Remove a row in a dataframe. Then the removal is concurrent.

Why not just stick to dataframes? I do some math manipulation on all elements in a list. This might be faster than operating on every element in a column in a dataframe. I haven’t checked.

Scaling up with VMs, docker containers inside VMs, and VMs managed by an orchestration software.

After creating a dockerfile, use docker build /path/to/dockerfile

Getting rid of the comment in the EXPOSE line resolved the error, i.e. syntax error. The docker file was used to build an image. The image size 563 Mb.

Another image was used to launch a container interactively in the shell, use flat -it for ‘interactive terminal’ and specify that /bin/bash is the command to be executed. A python script successfully runs inside the docker container, demo’d by printing hello.

Since nginx was installed, the container has a default dir structure: /var/www/html and an index.html file at this location.

at /bin ls shows the following psql files

The user installed programs are not /bin , instead they are at /usr/lib

If modify config file, restart the service for changes to take effect:

service postgresql restart

Authentication of a user can be via md5, via peer, or via scram-sha-256.

Switch to a user called postgres. Now we are in a VM, and then in a docker container, and able to access a db.

su — postgres [there is a space b/w the su and the dash and postgres, medium reformats]

When the docker container is started using an image, the postgresql server is installed because it was in the dockerfile. Must start the service:

service postgresql start

Entering and exiting the psql command line has two relevant commands: psql and \q . See the change in the command line.

Created 3 tables: bikes, planets; customers.

When writing sql queries to join tables, the order of tables specified by ON does not affect the result.

When adding a new column, it is necessary to declare the name and the data type for the values in the column.

Join more than 2 tables, as many as needed, by using multiple joins. The query below has 2 joins and aggregates data from 3 tables: customers, bikes, planets based on aligning the column ‘name’ from each of those tables. Resulting table has some redundancy, such as the column ‘name’ is repeated 3 times.

The results of a query can be saved as a new table or inserted into another table. Three Rows from ‘bikes’ are inserted into ‘cheap_bikes’ based on a price condition. Maya, Jill, and Jen have low price bikes.

SQL query result → adds on rows to another table that was already in the DB

The Cheap_Bikes table already had some information, such as about Maya’s bike, and now there is a repetition of data.

Types supported by PSQL:

1. Numeric Types

Integer Types:
SMALLINT: 2-byte integer, range: -32,768 to +32,767
INTEGER or INT: 4-byte integer, range: -2,147,483,648 to +2,147,483,647
BIGINT: 8-byte integer, range: -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807
Serial Types (Auto-increment):
SMALLSERIAL: 2-byte auto-incrementing integer
SERIAL: 4-byte auto-incrementing integer
BIGSERIAL: 8-byte auto-incrementing integer
Decimal Types:
NUMERIC(precision, scale): User-specified precision, exact decimal values, useful for monetary amounts.
DECIMAL(precision, scale): Synonym for NUMERIC.
Floating-Point Types:
REAL: 4-byte floating-point number.
DOUBLE PRECISION: 8-byte floating-point number.

2. Monetary Types

MONEY: Represents currency values with fixed fractional precision, formatted according to locale.

3. Character Types

CHAR(n): Fixed-length character, padded with spaces if necessary.
VARCHAR(n): Variable-length character with a specified limit.
TEXT: Variable-length character with no limit, useful for large text blocks.

4. Binary Data Types

BYTEA: Stores binary data, useful for files, images, or other binary content.

5. Date and Time Types

DATE: Calendar date (year, month, day).
TIME [WITHOUT TIME ZONE]: Time of day without time zone.
TIME WITH TIME ZONE: Time of day with time zone.
TIMESTAMP [WITHOUT TIME ZONE]: Date and time without time zone.
TIMESTAMP WITH TIME ZONE (alias TIMESTAMPTZ): Date and time with time zone.
INTERVAL: Represents a span of time (e.g., days, hours, minutes).

6. Boolean Type

BOOLEAN: Stores TRUE, FALSE, or NULL.

7. Enumerated Types

ENUM: User-defined list of allowed values (e.g., ENUM('small', 'medium', 'large')).

8. Geometric Types

POINT: Stores a geometric point (x, y).
LINE: Stores a geometric line.
LSEG: Stores a line segment.
BOX: Stores a rectangular box.
PATH: Stores a path (open or closed).
POLYGON: Stores a closed polygon.
CIRCLE: Stores a circle.

9. Network Address Types

Network Types:
CIDR: IPv4 or IPv6 network address.
INET: IPv4 or IPv6 host address.
MACADDR: MAC address.

10. UUID Type

UUID: Stores universally unique identifiers (UUIDs), often used as primary keys.

11. JSON Types

JSON: Stores JSON data in plain text format.
JSONB: Stores JSON data in a binary format, allowing faster access and indexing.

12. Array Types

Supports arrays of any data type (e.g., INTEGER[], TEXT[]), useful for storing multiple values in a single column.

13. Range Types

INT4RANGE: Range of integer values.
NUMRANGE: Range of numeric values.
TSRANGE: Range of timestamps (without time zone).
TSTZRANGE: Range of timestamps (with time zone).
DATERANGE: Range of dates.

14. Composite Types

Allows defining custom data types that group multiple fields (similar to a struct in other languages).

15. Special Types

TSVECTOR: Used for full-text search.
TSQUERY: Used to define a text search query.
XML: Stores XML data.
VOID: Represents a void return type, primarily used for functions.

Dataframes and Lists / Data Types / Docker Builds / PSQL

service postgresql restart

1. Numeric Types

2. Monetary Types

3. Character Types

4. Binary Data Types

5. Date and Time Types

6. Boolean Type

7. Enumerated Types

8. Geometric Types

9. Network Address Types

10. UUID Type

11. JSON Types

12. Array Types

13. Range Types

14. Composite Types

15. Special Types

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Anudha Mittal

No responses yet

More from Anudha Mittal

Models that modify Probability Distributions, I/O for Models via Web Protcols

LLMs are models that modify the probability distribution of a set of events, and predict the next event.

Tokenization In NLP / PSQL / Merging data

Initializes a vocabulary list based on every character

LLM Written Notes on Margins for Software Salary

Understanding Software Product Margins & Revenue Contribution

Deploying Web Applications & Monitoring Web Apps

pip install watchdog

Recommended from Medium

The 5 paid subscriptions I actually use in 2025 as a Staff Software Engineer

Tools I use that are cheaper than Netflix

How I Am Using a Lifetime 100% Free Server

Get a server with 24 GB RAM + 4 CPU + 200 GB Storage + Always Free

Lists

Staff picks

Stories to Help You Level-Up at Work

Self-Improvement 101

Productivity 101

Jeff Bezos Says the 1-Hour Rule Makes Him Smarter. New Neuroscience Says He’s Right

Jeff Bezos’s morning routine has long included the one-hour rule. New neuroscience says yours probably should too.

Goodbye Obsidian

10 MindBlowing Free APIs to Supercharge Your Next Project

Make your projects 10x better!!

How I Learned to Love `__init__.py`: A Simple Guide😊

💡 Heads Up! Click here to unlock this article for free if you’re not a Medium member!

How I Learned to Love `init.py`: A Simple Guide😊