This post was generated by GPT-3. My prompt was only the title and the first sentence.
At Tiny we're scaling really quickly. Our platform allows tens of thousands of developers to instantly provision a running test environment in a few clicks. This creates a huge amount of data which is stored on a MongoDB cluster hosted on Google Compute Engine.
A few months ago we started noticing some serious performance problems: the database could no longer keep up with our query load. We had all of our database configuration on auto-pilot, so it took us a while to realize we had a problem. After a few weeks of investigation it became obvious that MongoDB was not the right choice for us, and we decided to switch to Postgres.
Why We Chose Postgres
Postgres has been around since the dawn of the web and is one of the most battle-tested databases available. It is fast, open source, and runs on a wide range of platforms.
MongoDB has a very flexible schema and is a great choice for dynamic applications. When we initially started developing our service, our schema was very flexible: since we allow users to provision an environment in a few clicks, the data model changed from one environment to the next, and that was no problem for MongoDB. Over time our schema became stricter and users' data structures grew more consistent, but that no longer played to MongoDB's strengths and caused us a lot of performance problems.
Postgres, on the other hand, has a more rigid schema. There is no auto-sharding, the database servers have to be sized correctly, and sharding is much more involved to set up. There are, however, many open source tools that can help you monitor, optimize, and scale a Postgres cluster, and there are more resources online to help you migrate from one database to the other.
Moving to Postgres was actually very easy. All we had to do was install a few packages:
apt-get install postgresql postgresql-contrib postgresql-client
We had to make a few changes to our code so that it would work with Postgres' strict schema.
We added a schema field to every model. This is important because, unlike MongoDB, Postgres will not infer the structure of our data, so we track it ourselves.
class DataContainer < ActiveRecord::Base
  attribute :schema, :string
  attribute :data, :text, default: ""
end
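With that in place, every record carries its schema name next to its raw payload. For example (illustrative values, assuming the model above):

container = DataContainer.new(schema: "test_environment", data: '{"cpu": 2}')
container.schema # => "test_environment"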
We also have to be explicit about the length and format of each text field; we enforce the length with a validation.
class DataContainer < ActiveRecord::Base
  attribute :data, :text, default: ""
  attribute :content, :text, default: "text"

  # Keep content to at most 25 characters.
  validates :content, length: { maximum: 25 }
end
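A quick check that the limit is enforced (illustrative, assuming the model above):

DataContainer.new(content: "a" * 26).valid? # => false, content is too long
DataContainer.new(content: "short").valid?  # => true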
Some Mongoid extensions won't work with Postgres. To convert these models we used the ruby2ruby gem together with ruby_parser. Roughly, the approach looks like this (a sketch; the helper name is ours, not part of either gem):
"Postgres extension for mongoid/mongo_mapper" .split( "::" ).each do |component| mongo_model = component.split( "::" ).first module Mongo end Mongoid::Document.extensions.each do |extension| if extension.start_with?( "Mongoid::Extension::" ) mongo_model_ext = extension.split( "::" ).first mongo_ext = module Mongoid::Document module Extension mongo_model_ext.split( "::" ).each do |component| mongo_ext.split( "::" ).each do |component| mongo_ext.split( "::" ).each do |component| mongo_ext.split( "::" ).each do |component| mongo_ext.split( "::" ).each do |component| if component.include?( '-is-extended-type' ) && component.split( "-" ).last == "ruby2ruby" include component.split( "-" ).last else puts "Ignoring #{component} in mongo_extension" end end end end end end end end end end end end end end end
Adding indexes to a Postgres database is also very easy. We use data containers which hold several documents, and we needed to index them by schema and id. Here's how to create the unique indexes.
rails g migration add_to_container_index
class AddToContainerIndex < ActiveRecord::Migration
  def up
    add_index :containers, :schema, unique: true
    add_index :containers, :id, unique: true
  end

  def down
    remove_index :containers, :id
    remove_index :containers, :schema
  end
end
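Once the migration has run, equality lookups on schema can be served by the unique index (illustrative; assumes the DataContainer model maps to the containers table):

DataContainer.where(schema: "test_environment").first # served by the unique index on schema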
We also need to configure the database connection, otherwise Rails won't know how to reach the production database.
# config/database.yml
production:
  adapter: postgresql
  encoding: unicode
  database: <%= ENV['TINY_APP_DB_NAME'] %>

# db/migrate/*_add_to_container_data.rb
class AddToContainerData < ActiveRecord::Migration
  def up
    execute <<-SQL
      CREATE TABLE containers (
        id SERIAL PRIMARY KEY,
        schema VARCHAR(100) NOT NULL,
        content TEXT NOT NULL,
        FOREIGN KEY (schema) REFERENCES schema_fields(schema)
      );
    SQL

    execute <<-SQL
      CREATE UNIQUE INDEX containers_schema_id ON containers(schema);
      CREATE INDEX containers_id ON containers(id);
    SQL
  end

  def down
    execute <<-SQL
      DROP INDEX containers_schema_id;
      DROP INDEX containers_id;
      DROP TABLE containers;
    SQL
  end
end
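To confirm Rails is talking to the right database, a quick sanity check from a production console (output shape is what the postgresql adapter returns):

# $ RAILS_ENV=production rails console
ActiveRecord::Base.connection.execute("SELECT current_database();").first
# => {"current_database"=>"..."} (your TINY_APP_DB_NAME)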
Don't forget to add the migrations to the git repository.
$ git add db/migrate
$ git commit -m "Adding Postgres support to the containers"
$ git push
Running our tests on Postgres took a little more time than running them on MongoDB, though not much more; I suspect this is because our test environment has to process more requests. Performance on the production server, however, was better than with Mongo.
The last step was to set up monitoring for our Postgres cluster. We wrote simple scripts that run from a cron job, collect statistics, and store them in a database.
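A minimal sketch of such a script, assuming the pg gem and a db_metrics table we created ourselves (the table name and connection details are illustrative, not our exact setup):

#!/usr/bin/env ruby
require "pg"

conn = PG.connect(dbname: ENV["TINY_APP_DB_NAME"])

# Pull basic per-database counters from Postgres' built-in statistics view.
stats = conn.exec(<<~SQL).first
  SELECT numbackends, xact_commit, blks_read, blks_hit
  FROM pg_stat_database
  WHERE datname = current_database();
SQL

# Store a timestamped snapshot in our own metrics table (created beforehand).
conn.exec_params(
  "INSERT INTO db_metrics (recorded_at, numbackends, xact_commit, blks_read, blks_hit)
   VALUES (now(), $1, $2, $3, $4)",
  stats.values_at("numbackends", "xact_commit", "blks_read", "blks_hit")
)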
We're looking for smart developers. Hackers apply here!