Migrating a CouchDB database with Joyent & Stud
by YLD • March 26th, 2014 • 6min
tl;dr
A step-by-step guide to installing couchdb on ubuntu. But really, you should use iriscouch for your production couchdb needs. If you need help, don’t forget to go to #couchdb on irc.freenode.org; these guys are incredibly helpful
If you love node.js don’t forget to give nodejitsu a try too!
Intro
This week I had to migrate my first production database to a new environment. This documents the process in the hope that others find it useful
In my case I was upgrading an old couchdb, mostly for two reasons:
- This particular version of couchdb had a bug in handling ssl
- Couchdb versions prior to 1.2.0 didn’t automatically resume replication after a restart and had no auto compaction
In this tutorial you will find information on how to upgrade your couchdb, keep production running and safely “switch needles” after your new environment is tested and in production
Node.js
In this tutorial I’m going to use a lot of node.js tools. If you don’t have node.js installed you can do:
mkdir /opt/install
cd /opt/install
wget http://nodejs.org/dist/node-latest.tar.gz
tar xvf node-latest.tar.gz
cd node-v*/
./configure
make
make install
#
# Install some cool tools I use all the time
# and might be referenced in this article
#
npm install -g jsontool nave ghcopy nd futon cdir
Joyent
I’m a big fan of joyent so I decided to use them in this tutorial. However, I decided not to use smartos since at the time of this writing it does not support openssl 1.0.1
To use joyent you first need to download and install the smartdc client from npm:
npm install -g smartdc
sdc-setup
This should have installed smartdc and configured it with your help. However, if you need some more pointers please refer to the official smartdc documentation
If you are also intending to create a hot standby replica of your production system you will want to follow these steps but place the two machines in different data centers. You can see the list of available data centers by doing:
sdc-listdatacenters \
-u https://us-east-1.api.joyentcloud.com \
-a username \
-k keyname
This assumes your username is username and you wish to authenticate using the key information you store in joyent at set up time as keyname
Here is what the response to this request currently looks like:
{
"us-east-1": "https://us-east-1.api.joyentcloud.com",
"us-west-1": "https://us-west-1.api.joyentcloud.com",
"us-sw-1": "https://us-sw-1.api.joyentcloud.com",
"eu-ams-1": "https://eu-ams-1.api.joyentcloud.com"
}
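If you want to script the next steps, you need the endpoint URL for a given data center name. Here is a minimal sketch that pulls one out of the response above with sed. `datacenter_url` is a hypothetical helper, not part of smartdc, and it assumes the one-entry-per-line layout shown above:

```shell
# datacenter_url NAME
# Reads the sdc-listdatacenters JSON on stdin and prints the URL for NAME.
# Assumes each "name": "url" pair sits on its own line, as in the output above.
datacenter_url() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

You would pipe the output of sdc-listdatacenters into it, for example `... | datacenter_url eu-ams-1`.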
In this tutorial I’m going to use the https://us-east-1.api.joyentcloud.com data center for the production couchdb and https://eu-ams-1.api.joyentcloud.com for the hot standby replica
We now need to select the operating system we are going to install as well as the size of our virtual machine. Joyent calls the available bundled virtual machine images datasets and the virtual machine sizes packages. If you are curious about what other providers call them, you can check the documentation of pkgcloud for a unified vocabulary
sdc-listdatasets \
-u https://us-east-1.api.joyentcloud.com \
-a username \
-k keyname \
| json -a urn \
| grep ubuntu
We are going to use the latest ubuntu in this list, a.k.a. sdc:jpc:ubuntu-12.04:2.3.1
- sdc:sdc:ubuntu-10.04:1.0.1
- sdc:jpc:ubuntu-12.04-enstratus-public:2.0.2
- sdc:jpc:ubuntu-12.04:2.1.2
- sdc:admin:ubuntu-10.04-enstratus-public:1.0.1
- sdc:jpc:ubuntu-12.04:2.2.1
- sdc:jpc:ubuntu-12.04:2.3.1
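Picking the latest version by eye works for a short list. As a sketch, the same choice could be scripted. `latest_dataset` is a hypothetical helper, not part of smartdc, and it assumes URNs shaped like the ones above, with numeric versions of up to three parts:

```shell
# latest_dataset PREFIX
# Reads dataset URNs on stdin (one per line, as printed by `json -a urn`)
# and prints the one matching PREFIX with the highest version.
latest_dataset() {
  awk -F: -v prefix="$1" '
    index($0, prefix ":") == 1 {
      # the version is the last colon-separated field, e.g. 2.3.1
      split($NF, v, ".")
      key = sprintf("%05d%05d%05d", v[1], v[2], v[3])
      if (key > best) { best = key; urn = $0 }
    }
    END { if (urn != "") print urn }
  '
}
```

Appending `| latest_dataset sdc:jpc:ubuntu-12.04` to the sdc-listdatasets pipeline would print a single URN, skipping the enstratus images because their prefix differs.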
Now to select the size of our virtual machine:
sdc-listpackages \
-u https://us-east-1.api.joyentcloud.com \
-a username \
-k keyname \
| json -a name
Here is the list as of today. To understand this fully, check the details on the joyent website
- extra small 512 mb
- small 1gb
- medium 2gb
- medium 4gb
- large 8gb
- large 16gb
- xxl 48gb
- xl 32gb
Depending on how big your databases are you should select a different package. Unfortunately it seems like in joyent disk space is coupled with memory and number of vcpus, which is not great for couchdb. Feel free to reach out to them and ask them why (or for a custom build with more disk space)
In this tutorial I’m picking the medium 2gb for the hot standby replica, and a large 8gb for the live system
Now you can create your live couchdb:
sdc-createmachine \
-u https://us-east-1.api.joyentcloud.com \
-a username \
-k keyname \
--name couch-joyent-0 \
--dataset sdc:jpc:ubuntu-12.04:2.3.1 \
--package "Large 8GB"
This will output server details. Make sure you log these somewhere
Now create the replica
sdc-createmachine \
-u https://eu-ams-1.api.joyentcloud.com \
-a username \
-k keyname \
--name couch-joyent-1 \
--dataset sdc:jpc:ubuntu-12.04:2.3.1 \
--package "Medium 2GB"
Ubuntu
Let’s start by connecting to our virtual machines. I would recommend iterm2 so you can switch between local, live and replica.
ssh root@165.255.222.111
ssh root@37.255.222.112
I would also change your ps1 so you can easily distinguish between the two machines:
vi ~/.bashrc
# Edit the PS1 lines and replace with something like:
# Live:
# PS1='${debian_chroot:+($debian_chroot)}\u@couch-live-us:\w\$ '
# Replica:
# PS1='${debian_chroot:+($debian_chroot)}\u@couch-replica-eu:\w\$ '
. ~/.bashrc
Some ubuntu machines don’t ship with git and make, so let’s upgrade all our packages and install these two:
apt-get update
apt-get upgrade
apt-get install git make gcc build-essential -y
These machines might not have node.js, so follow the earlier steps to install it
CouchDB
I would recommend you follow the couchdb wiki on installing couchdb on ubuntu.
However, I’m going to document here the exact steps I took
mkdir /opt/install
cd /opt/install
# make sure you update this if a new version is out
wget http://mirrors.fe.up.pt/pub/apache/couchdb/releases/1.2.0/apache-couchdb-1.2.0.tar.gz
apt-get install -y erlang-dev erlang-manpages erlang-base-hipe erlang-eunit erlang-nox erlang-xmerl erlang-inets libmozjs185-dev libicu-dev libcurl4-gnutls-dev libtool
tar xvzf apache-couchdb-1.2.0.tar.gz
cd apache-couchdb-*
./configure
make
make install
CouchDB is now built but we still need to create a user for couch to use, and set appropriate permissions and ownership
useradd -d /var/lib/couchdb couchdb
chown -R couchdb: /usr/local/var/{lib,log,run}/couchdb /usr/local/etc/couchdb
chmod 0770 /usr/local/var/{lib,log,run}/couchdb/
chmod 664 /usr/local/etc/couchdb/*.ini
chmod 775 /usr/local/etc/couchdb/*.d
Finally we want to set up init.d scripts so we can daemonize couchdb and manage its service like all other ubuntu processes
# In case Ubuntu has some trash from a default installation
rm -f /etc/logrotate.d/couchdb /etc/init.d/couchdb
ln -s /usr/local/etc/logrotate.d/couchdb /etc/logrotate.d/couchdb
ln -s /usr/local/etc/init.d/couchdb /etc/init.d/couchdb
update-rc.d couchdb defaults
Let’s checkpoint here and make sure everything worked:
service couchdb start
curl localhost:5984
service couchdb stop
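CouchDB can take a moment to start accepting connections, so a curl fired immediately after the start may fail even when the install is fine. A small retry helper, as a sketch (`retry` is a hypothetical function defined here, not an ubuntu or couchdb tool):

```shell
# retry ATTEMPTS DELAY COMMAND...
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_count=1
  while [ "$retry_count" -le "$retry_attempts" ]; do
    "$@" && return 0
    retry_count=$((retry_count + 1))
    if [ "$retry_count" -le "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
  done
  return 1
}
```

For the checkpoint above this could be used as `retry 10 1 curl -sf localhost:5984`, which gives couchdb up to roughly ten seconds to answer.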
If something failed, it’s likely you will want to kill the couchdb processes you left lying around. You can execute this command to kill everything running as the couchdb user:
ps -u couchdb -o pid= | xargs kill -9
Let’s get our couchdb running:
service couchdb start
stud
stud stands for the scalable tls unwrapping daemon, and it’s a great ssl terminator that works on top of libev and openssl
I decided not to expose couchdb via regular http. Instead, stud will be our https front end to couchdb.
Installing stud in ubuntu is incredibly simple:
apt-get install libev4 libssl-dev libev-dev -y
cd /opt/install
git clone https://github.com/bumptech/stud.git
cd stud
make
make install
stud doesn’t come bundled with all the nice things couchdb does, so we need to create similar artifacts:
mkdir /var/run/stud
mkdir /usr/local/var/run/stud
mkdir /usr/local/etc/stud
touch /usr/local/etc/stud/stud.conf
You will also need a valid certificate for the domain you wish to use to expose your couchdb database. Get the pemfile and place it in /usr/local/etc/stud/stud.pem. A pemfile will include a private key and certificate information
touch /usr/local/etc/stud/stud.pem
vi /usr/local/etc/stud/stud.pem
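Since stud reads both the private key and the certificate from this one file, it can help to confirm both are present before starting the daemon. A minimal sketch, assuming standard PEM headers (`pem_has_key_and_cert` is a hypothetical helper name). It only checks that the blocks exist, not that the key matches the certificate:

```shell
# pem_has_key_and_cert FILE
# Succeeds only if FILE contains both a private key block and a
# certificate block. -e is needed because the patterns start with a dash.
pem_has_key_and_cert() {
  grep -q -e '-----BEGIN .*PRIVATE KEY-----' "$1" &&
    grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

For example: `pem_has_key_and_cert /usr/local/etc/stud/stud.pem || echo "stud.pem is missing a key or a certificate"`.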
Let’s make sure we handle security properly:
useradd -d /var/lib/_stud _stud
chown _stud: /usr/local/etc/stud/stud.pem
chown _stud: /var/run/stud
chown -R _stud: /usr/local/var/run/stud /usr/local/etc/stud
chmod 0770 /usr/local/var/run/stud/
chmod 664 /usr/local/etc/stud/*.conf
chmod 600 /usr/local/etc/stud/stud.pem
mkdir -p /etc/stud
mkdir -p /etc/default
touch /etc/stud/stud.conf
Ubuntu has an init.d script for stud. However, I had to tweak it a bit to make it work with a custom installation, namely because it checked for the daemon before allowing me to change the configuration.
You can download the init.d script from this gist
rm -f /etc/init.d/stud
curl -L https://gist.github.com/dscape/4470972/raw/stud > /etc/init.d/stud
chmod +x /etc/init.d/stud
We installed stud from source and we need to provide the script the paths of our custom installation:
vi /etc/default/stud
In my case these were the changes I needed to make:
PATH=/usr/local/bin:/sbin:/usr/sbin:/bin:/usr/bin
DAEMON=/usr/local/bin/stud
CHROOT="/usr/local/var/run/stud"
COMMON_OPTIONS="-r $CHROOT -u $USER --config /usr/local/etc/stud/stud.conf"
Final step is to put our stud configuration in /usr/local/etc/stud/stud.conf
frontend="[*]:6984"
backend="[127.0.0.1]:5984"
pem-file="/usr/local/etc/stud/stud.pem"
ssl=on
workers=2
syslog=on
Start stud:
update-rc.d stud defaults
service stud start
We can test this is working. Go to your local machine and try it out:
$ curl 165.255.222.111:5984
curl: (7) couldn't connect to host
# -k means ignore ssl errors, because the certificate is for a domain, not an ip
$ curl https://165.255.222.111:6984 -k
{"couchdb":"Welcome","version":"1.2.0"}
You should do the same check for the replica database 37.255.222.112
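Because the whole migration is motivated by the couchdb version, it is worth confirming that both machines report the version you expect. A sketch that extracts it from the welcome response with sed. `couch_version` is a hypothetical helper and assumes the compact JSON shape shown above:

```shell
# couch_version
# Reads the couchdb welcome JSON on stdin and prints the "version" value.
couch_version() {
  sed -n 's/.*"version":"\([^"]*\)".*/\1/p'
}
```

Usage would look like `curl -sk https://165.255.222.111:6984 | couch_version`, repeated with the replica's ip.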
Now go to your dns provider and make sure you point something like my-couch.mydomain.com to the ip of the machine (a record). Do the same for your replica database. If you try to curl against the domain you will see it now works without the -k option
A piece of unsolicited advice: use multiple dns providers in case one of them goes down. It happened once, it might happen again
Configuring CouchDB
We now need to configure our couchdb server. By default it comes in admin party mode but normally we want couchdb to be accessible only with a valid username and password
Browse to your futon:
And click on fix this to create our admin username and password
This will create an admin user but futon will still be visible in a read only capacity without authentication. To force authentication you should edit the local.ini file:
vi /usr/local/etc/couchdb/local.ini
Add
[couchdb]
delayed_commits = false
[couch_httpd_auth]
; some lines before
require_valid_user = true
This will take effect on server restart, but since we are editing this file let’s add our auto-compaction configuration. Compaction is a cpu/disk intensive operation so it should be scheduled accordingly. The auto-compaction feature was introduced in couchdb 1.2.0. Here we are going to use a simple configuration, but I strongly advise you to check the documentation instead of blindly copying
[daemons]
compaction_daemon={couch_compaction_daemon, start_link, []}
[compaction_daemon]
check_interval = 300
min_file_size = 131072
[compactions]
_default = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "00:00"}, {to, "04:00"}, {strict_window, true}]
Having worked with other databases that do compactions, I would advise you to have at least 2 cpus per database and to schedule compactions for when writes to your database are few
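A setting placed under the wrong section header is easy to miss in a long local.ini. As a sanity check before restarting, here is a sketch of a small reader that prints the value of a key within a given section. `ini_get` is a hypothetical helper written for this article, not a couchdb tool, and it only understands simple `key = value` lines:

```shell
# ini_get FILE SECTION KEY
# Prints the value of KEY inside [SECTION] (the last one if repeated).
# Exits non-zero if the key is not found in that section.
ini_get() {
  awk -v section="$2" -v key="$3" '
    /^[ \t]*;/ { next }                       # skip ; comments
    /^\[.*\]$/ { current = substr($0, 2, length($0) - 2); next }
    current == section {
      eq = index($0, "=")
      if (eq == 0) next
      k = substr($0, 1, eq - 1)
      v = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)          # trim the key
      gsub(/^[ \t]+|[ \t]+$/, "", v)          # trim the value
      if (k == key) value = v
    }
    END { if (value != "") print value; else exit 1 }
  ' "$1"
}
```

For example, `ini_get /usr/local/etc/couchdb/local.ini couch_httpd_auth require_valid_user` should print `true` after the edits above.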
Now restart couchdb:
service couchdb stop
service couchdb start
Ok, browse back to futon and see that it now requires username and password to access
Migrating the data
We now need to migrate our data from our production CouchDB to our new live system. For this we will set a replicator job that will continuously replicate from the active production couch
You don’t want to use futon for this, because continuous replications get cancelled on restart when done from futon. To do this right you need to use the _replicator database which was introduced in couchdb 1.2.0.
Just do this for each production database you want to continuously replicate.
function register_replication() {
  # $1 is database name
  # $2 is https://localuser:localpass@localhost:6984
  # $3 is https://remoteuser:remotepass@remote:6984
  DATA='{"source": "'$3'/'$1'","target": "'$1'","connection_timeout": 60000,"retries_per_request": 20,"http_connections": 30, "continuous":true, "user_ctx": { "roles": [ "_admin" ] }}'
  echo "database: $1"
  echo "local: $2"
  echo "remote: $3"
  echo "data: $DATA"
  echo
  echo "proceed? (control+c to cancel)"
  read
  # create the target database locally
  curl -k -vX PUT "$2/$1"
  # register the continuous replication in the _replicator database
  curl \
    -X POST \
    -k \
    -H "Content-type: application/json" \
    "$2/_replicator" \
    --data "$DATA"
}
You can now call this for each of the databases you want to replicate
register_replication \
foobar \
https://u1:pw2@localhost:6984 \
https://u2:pw2@my-couch.mydomain.com:6984
And for your replica:
register_replication \
foobar \
https://u3:pw3@my-couch-replica.mydomain.com:6984 \
https://u1:pw2@localhost:6984
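The DATA string inside register_replication relies on nested shell quoting, which is easy to break when editing. As an alternative sketch, the same document can be built with printf. `replication_doc` is a hypothetical helper, the fields are the ones used above, and it assumes the database name and url contain no quotes or backslashes since nothing is JSON-escaped:

```shell
# replication_doc DATABASE REMOTE_URL
# Prints the _replicator document that pulls DATABASE from REMOTE_URL
# into a local database of the same name.
replication_doc() {
  printf '{"source":"%s/%s","target":"%s","connection_timeout":60000,"retries_per_request":20,"http_connections":30,"continuous":true,"user_ctx":{"roles":["_admin"]}}\n' \
    "$2" "$1" "$1"
}
```

Printing the document first, for example `replication_doc foobar https://u2:pw2@my-couch.mydomain.com:6984`, lets you inspect it before posting it to _replicator.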
Pulling the switch
When we decide to migrate to the new couch we can change the pointer to the new database and the old one will stop getting documents. After that we can remove the replications from the new live system. As for our replica, it will pull from our new live system and will always be in standby mode
You will need to change your configuration files so the correct server gets called. However, before doing that it is advisable that you warm up your views. In couchdb, views are only built on the first request. This means that if you migrate your system and you have a lot of traffic the first couple of requests will probably time out, which is not that great
We need to connect to both the live and replica servers and make sure all views are created. (Sidenote: if you are also adding new stuff to design documents don’t forget to do it right or you will have exactly the same problem as described above)
You can now build the couchdb views by using the couchdb-build-views script:
npm install -g couchdb-build-views
couchdb-build-views --help
Now just call the script:
couchdb-build-views --couch https://u3:pw3@my-couch-replica.mydomain.com:6984
couchdb-build-views --couch https://u2:pw2@my-couch.mydomain.com:6984
Now that you are done, don’t forget to delete that silly directory and clean your history:
cd
rm -rf ~/deletemelater/
rm ~/.bash_history
history -c
touch ~/.bash\_history
We are all done and ready for a new adventure: testing this new environment in terms of load and api. So don’t forget to check the couchdb changelog and test appropriately before switching
Written by: Nuno Job
Originally published at blog.yld.io on March 26, 2014.