Browsed by
Author: yreece

chugging along, learning sql

chugging along, learning sql

Learning SQL and databases has been fun so far. It is opening me up to new concepts, and I feel like I’m gaining a foundational understanding of information science. I find myself thinking about the information in other areas of my life and applying the concepts I have learned. For example, I made a database schema for my online English class.

As I go along through the ‘Design Databases with PostgreSQL’ course on CodeCademy, I have little moments of victory when I write a query correctly by myself. This was my most recent case of that:

SELECT book.title AS book_title, author.name AS author_name, book.description AS book_description
FROM book, author, books_authors
WHERE book.isbn = books_authors.book_isbn AND author.email = books_authors.author_email;

SELECT author.name AS author_name, author.email AS author_email, book.title AS book_title
FROM author, book, books_authors
WHERE author.email = books_authors.author_email AND book.isbn = books_authors.book_isbn;

These queries connect two tables together using a junction table. The first query shows for each book the corresponding authors. The second query shows for each author the corresponding books. This exercise is to demonstrate a many-to-many relationship.

Additionally, claude.ai pointed out that the same result can be reached using a different approach. This shows the same query using JOIN:

SELECT book.title AS book_title, 
       author.name AS author_name, 
       book.description AS book_description
FROM book
JOIN books_authors ON book.isbn = books_authors.book_isbn
JOIN author ON author.email = books_authors.author_email;
Exploring a baseball database

Exploring a baseball database

As part of the CodeCademy course ‘Design Databases With PostgreSQL’, I am getting oriented with databases by exploring this baseball database created by Sean Lahman.

I opted to use psql and the command line interface for this project, as opposed to the GUI client.

The project from CodeCademy was to do various analyses on the data by creating sql queries. I found that it was pretty much above my sql skill level to create the queries on my own, so I ended up re-typing each query from the solutions file. I was able to read and understand the logic, though for joins involving multiple conditions the logic was difficult to wrap my head around. Also, I’m not super familiar with baseball.

After completing the assignment, I asked Claude.ai to create a SQL query to show the trend in average salaries for baseball players. I used the output to make this chart in Google sheets.

Fastest route analysis using QGIS and Google maps

Fastest route analysis using QGIS and Google maps

Since I am preparing to move again, I need to start getting rid of the stuff that I’m not going to take. One of the main things is the clothing that I’ve accumulated over the last year. I will take it to the second-hand shops and see if I can sell it. I expect that I will need to visit multiple shops in order to get rid of all of it. My objective is to find the shortest route to hit the nearest several 2hand shops.

I opened Google Earth Pro. I typed ‘Thai Nguyen’ to navigate to my city. I then typed ‘2hand’, which is the local name for second-hand clothing stores. The results appeared on the map. I clicked the ‘copy to clipboard as kml’ button, and pasted onto a blank notepad, saving that as 2hand.kml.

I opened a new project in QGIS and added the 2hand.kml layer. The points appeared on the canvas. I also added a Google road basemap.

I created an isochrone, which shows as a polygon how far I can bicycle from home in 4 minutes, in every direction. I then found the fastest route to reach all the points contained in that polygon.

I exported the route as a KML file and imported that into Google maps. Now I can use the google maps app on my phone to view my route and navigate from point to point on my bicycle.

One note: Not all of the route seems to me like the fastest route. I’m sure the algorithm calculated what it was supposed to calculate, but somehow the result isn’t perfect. So, I’ll use the route as a guide, but depart from it slightly as needed. The fastest route analysis was still useful in showing the general order of stores that I should follow.

Setting up PostgreSQL server, client and command line tools

Setting up PostgreSQL server, client and command line tools

I downloaded postgresql and postbird. Now I can create and manage databases locally.

I practiced creating tables, adding columns and rows with various data types, backfilling data, querying the data and adding constraints. I practiced doing this using both the Postbird GUI and psql CLI.

postgres=# select * from films;
 id |           name            | release_date | category  | stars
----+---------------------------+--------------+-----------+-------
  1 | Bugs Life                 | 1998-11-14   | animation |     3
  2 | Matilda                   | 1996-07-28   | comedy    |     4
  3 | Return of the Jedi        | 1983-05-25   | sci-fi    |     2
  4 | Shrek                     | 2001-03-23   | animation |     5
  5 | The Count of Monte-Cristo | 2024-04-03   | action    |     3
  6 | Napoleon Dynamite         | 2004-04-19   | comedy    |     5
(6 rows)
A Video Transcription and Subtitling App

A Video Transcription and Subtitling App

This is a python application which I built. It transcribes video to text, allows the user to edit the transcript in the text editor, then saves and adds the subtitles back onto the video. It utilizes faster-whisper and ffmpeg. I built the core functionality by following a tutorial on digitalocean, then vibe-coded using Claude.ai to expand the functionality and add the tkinter graphical user interface.

CodeCademy course: Design Databases With PostgreSQL

CodeCademy course: Design Databases With PostgreSQL

This was the first project in a course that I am starting. In it, I learned some foundational database concepts and how to query, create, alter database information using SQL. Here are my first ever SQL queries:

CREATE TABLE friends (
  id INTEGER, 
  name TEXT, 
  birthday DATE
);

INSERT INTO friends (id, name, birthday) 
VALUES (1, 'Ororo Munroe', '1940-05-30');

INSERT INTO friends (id, name, birthday) 
VALUES (2, 'Didi', '2024-06-16');

INSERT INTO friends (id, name, birthday) 
VALUES (3, 'Luffy', '1999-05-05');

INSERT INTO friends (id, name, birthday) 
VALUES (4, 'water god', '1225-06-16');

UPDATE friends
SET name = 'Storm'
WHERE id = 1;

ALTER TABLE friends 
ADD email TEXT;

UPDATE friends
SET email = 'storm@lessonslides.com'
WHERE id = 1;

UPDATE friends
SET email = 'didi@lessonslides.com'
WHERE id = 2;

UPDATE friends
SET email = 'luffy@lessonslides.com'
WHERE id = 3;

UPDATE friends
SET email = 'watergod@lessonslides.com'
WHERE id = 4;

DELETE FROM friends
WHERE id = 1;

SELECT * 
FROM friends;

At the finish line

At the finish line

The project is almost complete! I used chatGPT to help me create a simple GUI with TKinter, and then used PyInstaller to bundle everything up into an executable (.exe) file. This self-contained executable can be stored on a flash drive and run on any computer without installing. I had read that portable apps can be slower-performing due to the speed of usb information transfer, however i don’t notice a difference after trying it.

The flash drive I bought for the purpose

The UX is just how I wanted. The user can just double click the .exe file to launch the GUI, select a video file and click “transcribe”, the .srt file automatically opens in notepad, the user proofreads and translates, then saves and closes, and clicks “add to video”. The subtitled mp4 video appears in the output folder.

I had some trouble with Windows Defender flagging the executable as malware. The name of the supposed malware is Trojan:Win32/Wacatac.B!ml. I did a little bit of reading, and it seems that this is commonly a false positive when developing apps in python. Still, I want to be able to distribute this file to my coworkers and for them to not worry.

I digitally signed the app by creating a self-signed certificate. I’ll include it with the .exe file when sharing, so users can double click the .cer to open a wizard and add it to their trusted store (thereby telling their system to trust the app and not give malware warnings). I also created a text document explaining and proving instructions about this.

I also submitted the file to Microsoft Security Intelligence for Malware Analysis. This will give me more assurance that the file isn’t malware. I’m still waiting for the results, which can take a few days.

Can now transcribe using faster whisper

Can now transcribe using faster whisper

I completed the tutorial showing how to use faster-whisper and ffmpeg to transcribe audio. I now have a Python script that will transcribe audio, generate a subtitle file, and add subtitles to the video with the click of a button.

currently, I run this tool using the CLI and a virtual Python environment. I am interested in creating a very simple GUI using tkinter. The GUI would allow the user to use the script without using the CLI.

I want them to be able to generate the subtitle file and add the subtitles to the video as two separate steps. This is so that they can proofread and edit the transcribed text before the subtitles are actually added to the video. The user would first click a “select file” button. Then, they would click “transcribe” to run the first part of the script (audio extraction, transcription and generate the subtitle file). When the .srt file is created, it would automatically be opened in a notepad (just the notepad app already on their computer). The user would proofread and edit this, then save it. Then the user would click an “add subtitles to video” button to run the add_subtitle_to_video part of the script.

I want this to be configured for bundling as a standalone .exe with all dependencies included, in order to be used as a portable app on a flash drive.

Troubleshooting all morning, but made progress

Troubleshooting all morning, but made progress

Following the digital ocean tutorial, I had a hang-up when I tried to install the faster-whisper package and couldn’t. It turned out that CTranslate2, which faster-whisper relies on, is not compatible with the Python version I was using. I then downloaded and installed Python version 11 and tried again and was successful. All that took 3 hours of troubleshooting. Well, now I know.

I have now created the project directory, downloaded the input video that will be used in the tutorial, set up a virtual environment, activated it, and installed the necessary Python packages.

Side Quest

Side Quest

The last few days, I have been going on a side-quest into a project related to my work at the English center. While it is not geospatial-related, I am attempting to build an app to solve a problem at work.

I have been using chatGPT to guide me through setting up what I need to install what I need. I installed a standalone Python, FFMPEG, Git (to download Whisper), and Visual Studio code (in order to set it as the default text editor for Git). I also downloaded the Python extension for Git.

I have also been using the command line, which is a first for me. I’ve learned how to edit system environment variables to add a new path so that the command line can find Python. I was able to download Whisper via Git using the command line.

Now that I have all the tools installed, I’m considering whether to try to use the script transcribe.py that ChatGPT wrote for me, or to follow the DigitalOcean tutorial. But for now I shall call it a day. Tomorrow.