
My homework this week was to create a database schema, add some data from one of four sources, connect it to an AWS RDS instance using MySQL, and use the monitoring tools within AWS RDS to set up a CloudWatch dashboard to monitor the database. Since I had some issues with my computer this week, I decided to make it easier on myself and just use PGAdmin (4) and an AWS RDS instance of a PostGreSQL (9.6) database. While it was fairly simple to download PGAdmin and set up a new AWS RDS PostGreSQL instance, I ran into an issue connecting the two. I originally did not set up the my PostGreSQL instance to allow for public access (along with a security role) and while I entered the correct endpoint and other data in PGAdmin, it would not accept connections. Thankfully, the developer docs had a trouble shooting area and once I figured out that this was my issue, I was able to connect it. At least I know that the no-public access feature works!
For the assignment, I chose the cereal dataset – although the data didn’t contain my favorite cereal… Since we only needed to create one table, I just used the fields from the csv file as the columns in my table. The csv file contained datatypes in the second row of the data, but the data types didn’t correspond to datatypes in PostGreSQL. However, it was easy enough to figure out what would work, e.g. instead of a float, I used double precision.
I was able to import all of the CSV data into the database using PGAdmin, although I had a few issues initially. The names of some of the cereals had commas, which threw an error message and prevented the import. In retrospect, I should have cleaned the data first by writing a short script to insert an escape character before any commas in the cereal column. However, I used a brute force method that wouldn’t work in real life: I deleted the commas from name column. I was then able to import the data without issue.
The CloudWatch dashboard that I created had four widgets: Freeable Memory, Read Latency, Write Latency and Database Connections. While I don’t have a lot of data in this database, at least now I know how to set up monitoring and make changes when there is not a lot at stake.