Hosted Elasticsearch

Hosted Elasticsearch provides you with the awesomeness that is Elasticsearch without the headaches of managing the server.

I’ve blogged about installing Elasticsearch and securing Elasticsearch. But what about hosted Elasticsearch? There are several great options out there if you don’t want to deal with managing a server, updates, security, and all the fun that comes along with running your own managed services.

Most of these provide fancy management interfaces, one-click scaling, SSD options, and built-in security.

Qbox.io

Qbox.io provides managed cloud hosting for Elasticsearch on dedicated clusters running on EC2, Rackspace, or SoftLayer.

Qbox is a dedicated hosting solution for Elasticsearch that aims to be as simple and intuitive as possible for the application developer. Conceptually, what we provide is similar to any other cloud-hosted database. Through our dashboard, you can launch your own Elasticsearch “cluster”. A cluster is a collection of servers behaving as “nodes”, which can be added/removed on the fly for scale.

Securing Elasticsearch can be a pain in your ass. Trust me. But Qbox handles all that for you with free SSL by default, IP whitelisting, and HTTP Basic authentication options. And Qbox works with Elasticsearch's sister products Kibana and Logstash.

While there is no free pricing tier, they do give you $35 in credits upon signing up just to make sure the service fits your needs. For about $32/mo you can run a server with 512MB RAM and 20GB of disk on Rackspace. They also offer SSD plans.

For what it’s worth, of this list of service providers, Qbox is the only one listed as an official Elasticsearch partner.

Found

Found allows you to build your own dedicated hosted Elasticsearch cluster in minutes. Found's infrastructure runs on Amazon's EC2 across several different regions with SSD options and fully configurable specifications. There's also the option to add fault tolerance by spanning your cluster across up to 3 data centers.

Found also does not provide a free tier, instead offering a 14-day free trial. However, unlike Qbox, a credit card is not required to get started.

The cheapest plan runs at about $30/mo in a single data center with 256MB RAM and 2GB of storage. But Found only charges for what you actually use.

You only pay for reserved memory and disk, and usage is billed by the hour. This gives you the flexibility to create temporary test and development clusters – when you’re ready to move on, you simply delete them, hence you stop paying for them.

All communication runs over SSL, and their clusters run in isolated containers, giving you the option to create custom Access Control Lists for fine-grained control over who and what can access your cluster. Additionally, Found keeps a continuous backup of your indices on Amazon S3.

Lastly, they maintain a fantastic blog on all things Elasticsearch.

Bonsai

Bonsai is yet another option for hosted Elasticsearch that has similar features to the aforementioned services. It provides easy setup on super fast SSD machines that are replicated in triplicate and backed up daily. All plans include Multi-Zone replication, SSL, secure API access, continuous backups and monitoring.

The cheapest plan at Bonsai is $50/mo for 1GB RAM and 10GB of storage.

IndexDepot

IndexDepot, out of Berlin, Germany, has an inexpensive hosted Elasticsearch option to get started. For $15/mo, a "Micro" instance includes 1 managed index, 10,000 documents, and 1 GB of storage, with a free 30-day trial. Their infrastructure, like most of the others, runs on Amazon Web Services. They offer similar features to the other solutions, including access control, monitoring, backups, and more.

Facet Flow

Facet Flow is hosted Elasticsearch for Microsoft Azure. Facet Flow has similar features to the above service providers, although nowhere on their site do they explicitly address security, which I think is one of the more important aspects of choosing a hosted Elasticsearch provider.

One thing Facet Flow offers that none of the other services listed do is a "Sandbox" plan: free access to an index with 5,000 documents, 500MB of storage, 1 shard, and 0 replicas. Not something you'd run in production, but definitely an option to get started.

Searchly

Aside from Facet Flow's sandbox plan, Searchly provides the cheapest option to get up and running. For $9/mo you can manage 3 indices with 100MB of SSD storage, an Access Control List, SSL, and complete API access. And they have a free plan with 2 indices and 5MB of storage. Searchly runs in the US data centers of Amazon EC2.

FWIW – Searchly provides a pretty slick Elasticsearch plugin for WordPress.

Summary

There seem to be new hosted Elasticsearch options popping up all the time. If I've missed any, please let me know in the comments. But I think these are the front runners, and all offer a similar solution and level of service. Your choice will likely depend on budget and available features.

If I had to choose for you at this point in time, I’d suggest Found or Qbox as they appear to be the clear leaders in the space.

Securing Elasticsearch

Securing Elasticsearch is extremely important if you are running it in production. I learned the hard way.

By design, security is not built into Elasticsearch. They leave it up to you as the developer to implement the right security for your environment. This is both good and bad.

I thought security through obscurity would work for a short time – I was wrong.

Earlier this week someone used the dynamic scripting exploit on a publicly accessible Elasticsearch server to inject a script that in turn initiated a series of DDOS attacks from our internal network against Chinese websites, crippling our office network for half a day. Securing Elasticsearch then became a priority :)

In this post, we'll discuss several options for securing Elasticsearch, including keeping it patched, firewalls and iptables, an NGINX reverse proxy, anonymous read-only access, and the Jetty and HTTP Basic auth plugins.

1. Keep Elasticsearch Updated

This is one of the reasons one of my Elasticsearch indexes got compromised. We were running version 1.1 in production and, like many who support systems in production, I wanted to set it and forget it. It's working fine – don't touch it! So I hadn't yet gotten around to upgrading to Elasticsearch 1.2.

Guess what – ES 1.2 included an important security update that turns off the dynamic scripting feature by default. Running a publicly accessible ES server with dynamic scripting turned on is asking for trouble. This is what was used to turn our Elasticsearch server into a network crippling DDOS zombie.
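If you're stuck on an older release for the moment, you can also turn dynamic scripting off explicitly in elasticsearch.yml. A minimal sketch for the 1.x line (verify the setting name against the docs for your version):

script.disable_dynamic: true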

Even if you do use any of the solutions below to protect your ES server, it’s still a good idea to keep it updated (not to mention getting all the new awesome features!)

TL;DR: Keep your Elasticsearch install up to date

2. Put Elasticsearch behind the firewall

This is the easiest solution: don't have a publicly accessible Elasticsearch server. If you are running Elasticsearch in the cloud, like on EC2, or in the enterprise, you can set up firewall rules so that only the IPs of your development and server environments can access it. Only allowing access to ports 9200 / 9300 from whitelisted IPs is a great way to secure your Elasticsearch server.

Another option is to use iptables, which is available on most Linux distros and offers similar functionality to a firewall for a single server.
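A rough sketch of the iptables approach (203.0.113.10 stands in for a trusted IP; note that these rules don't persist across reboots unless you save them):

# Allow the Elasticsearch HTTP and transport ports from a trusted IP only
iptables -A INPUT -p tcp -s 203.0.113.10 --dport 9200 -j ACCEPT
iptables -A INPUT -p tcp -s 203.0.113.10 --dport 9300 -j ACCEPT
# Drop everything else hitting those ports
iptables -A INPUT -p tcp --dport 9200 -j DROP
iptables -A INPUT -p tcp --dport 9300 -j DROP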

While this is probably 'well, duh' for you, it was something I thought could wait – security through obscurity as I like to call it. I had an Elasticsearch server running publicly so a group of distributed developers could access it. I didn't want to deal with setting up firewall rules for everyone's IP addresses that might change tomorrow and then again next week. I thought I'd wait until we rolled it out to production to lock it down.

TL;DR: Use a firewall or iptables to shield your Elasticsearch server from the public internets.

3. Use NGINX for Reverse Proxy with HTTP Authentication

Sometimes you want your Elasticsearch server to be reachable from outside your network. This is a good solution if you have a distributed team of devs, use cloud hosting, or just want to expose your data to a select group of outside people. Using this method you can also set up multiple usernames and passwords that can access your index, giving you finer control over granting and revoking access.

And this is super easy to set up if you are running Elasticsearch as a Linux service. Simply install NGINX, set up HTTP authentication, update a couple of config files, and make sure Elasticsearch is bound to localhost (the default when you install Elasticsearch). As a bonus, NGINX gives you more control and lets you run everything through SSL.
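For the HTTP authentication piece, the usual route is an htpasswd file. A quick sketch on Ubuntu (the username and file path are just examples):

# Install NGINX and the htpasswd utility
sudo apt-get install nginx apache2-utils
# Create a password file with one user (-c creates the file; omit it to add more users)
sudo htpasswd -c /etc/nginx/.htpasswd elastic_user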

Here's an NGINX configuration to run as a reverse proxy with HTTP authentication in front of your Elasticsearch service.
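Something along these lines, assuming Elasticsearch is bound to localhost:9200 and the htpasswd file created above (the server name is a placeholder):

server {
    listen 80;
    server_name search.example.com;   # placeholder, use your own hostname

    location / {
        # Require a username/password on every request
        auth_basic "Restricted Elasticsearch";
        auth_basic_user_file /etc/nginx/.htpasswd;

        # Forward everything to the local Elasticsearch instance
        proxy_pass http://localhost:9200;
        proxy_redirect off;
    }
}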

TL;DR: Use NGINX as a reverse proxy for Elasticsearch, with HTTP authentication on requests.

4. Anonymous Read Only Access

You may want your Elasticsearch data open to the public – but you don't want just anyone to be able to change it. Perhaps you are running a public API where you want to use ES to expose data. Maybe you are serving up recipes or some sort of aggregate data to other developers or publishing the data to public websites. Or you are using the elasticsearch-js client and want to use Angular as a front end (or Kibana!) interface to your index.

There are a couple of ways to do this. First, you can set up an NGINX reverse proxy (similar to #3 above), except you don't need to set up authentication. You can simply allow only GET requests, using an NGINX config along these lines (via Stack Overflow).
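A rough sketch of that idea (the actual config from the Stack Overflow answer may differ; the server name is a placeholder):

server {
    listen 80;
    server_name search.example.com;

    location / {
        # Allow GET (and, implicitly, HEAD); reject POST, PUT, DELETE, etc.
        limit_except GET {
            deny all;
        }
        proxy_pass http://localhost:9200;
    }
}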

You can also set up the NGINX config to only allow requests to the _search endpoint in Elasticsearch.
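For example, swapping location blocks like these into the server block above only proxies the _search endpoint and refuses everything else (again, just a sketch):

    # Only expose the search API, read-only
    location ~ /_search$ {
        limit_except GET {
            deny all;
        }
        proxy_pass http://localhost:9200;
    }

    # Everything else is off limits
    location / {
        return 403;
    }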

What if you don't want to deal with setting up an NGINX proxy? There is an Elasticsearch plugin called Readonly REST Elasticsearch Plugin by Simone Scarduzio that will accomplish this for you.

This plugin makes it possible to expose the high performance HTTP server embedded in Elasticsearch directly to the public, denying access to the API calls which may change any data.

 

No more proxies! Yay Ponies!

TL;DR: For public anonymous access, only allow read-only access or access to the Search API.

5. Use the Jetty Plugin

The elasticsearch-jetty plugin adds a few awesome features to your Elasticsearch server including SSL support, basic authentication, request logging, and Gzip compression of responses.

The elasticsearch-jetty plugin brings full power of Jetty and adds several new features to elasticsearch. With this plugin elasticsearch can now handle SSL connections, support basic authentication, and log all or some incoming requests in plain text or json formats.

6. Elasticsearch HTTP Basic Auth Plugin

The elasticsearch-http-basic plugin is similar to Jetty without all the additional features. It simply handles basic authentication. This is simpler to set up and use than Jetty but lacks the additional features that make Jetty awesome. But if you just want auth – give it a shot.
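Configuration goes in elasticsearch.yml. A sketch of what that looks like for this plugin (setting names are from memory of the plugin's README, so verify them against the version you install):

http.basic.enabled: true
http.basic.user: "search_user"        # example credentials, change these
http.basic.password: "change_me"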

Summary

There is no standard way to secure Elasticsearch – it depends on your environment and requirements. But any one of these six approaches will be better than just firing up an Elasticsearch index out on the public internet.

Don’t want to worry about all this fun stuff? Check out some hosted Elasticsearch options.

Install Elasticsearch in 5 Minutes

This is a short tutorial to install Elasticsearch in 5 minutes on Ubuntu in a Digital Ocean droplet.

I’ve been working with WordPress for a long time and what really got me hooked in the early days was the “Famous 5-Minute Install”. I’m extending that same concept to one of my new favorite tools - Elasticsearch. It’s a super fast search service built on Lucene that has an embedded RESTful JSON API.

Since it's native JSON, any object you have in your code – whether it be a JavaScript object or a C# object – can be serialized and inserted into an Elasticsearch index. So technically you can use it as a NoSQL database. It clusters and does a lot of other fancy stuff but that's not the point of this article. Anyway, you probably already know what it is if you stumbled on this post, so let's get your very own Elasticsearch sandbox up and running…

Step 1: Get A Server

In order to get this done in 5 minutes we're going to use Digital Ocean to spin up a cloud server. Why? Because it's awesome and your server will be ready in 55 seconds… It's cheap to run and free to get started if you use one of their many promo codes. If this doesn't sound awesome to you, feel free to spend an hour or so setting up a Linux virtual machine. Either way, this tutorial assumes you are going to run Elasticsearch on Linux, specifically Ubuntu.

So after you sign up for Digital Ocean, set up a free Ubuntu Droplet (more info than you need is here). They'll email you the root password and you should be good to go to access the Linux console from their website.

Note: there are a bunch of other things you’ll want to do if you run this server in production – like setting up SSH, disabling root login, and other things. Follow this tutorial for ‘Initial Server Setup With Ubuntu‘ for more details.

Step 2: Install Elasticsearch

Now you are ready to install Elasticsearch. Fortunately, that's the easy part. Run the shell script in this gist to get up and running.
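The gist boils down to something like the following (the Elasticsearch version shown was current at the time of writing; swap in whatever is current for you):

# Elasticsearch needs a Java runtime
sudo apt-get update
sudo apt-get install -y openjdk-7-jre-headless

# Download and install the Elasticsearch .deb package
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.2.deb
sudo dpkg -i elasticsearch-1.3.2.deb

# Start the service now and on every boot
sudo service elasticsearch start
sudo update-rc.d elasticsearch defaults 95 10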

Aaaand you’re done.

Want to make sure it’s running? Run a curl in your console, hitting port 9200.

curl http://localhost:9200

You should see something like this, giving you some metadata about your Elasticsearch instance.
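The exact values will differ (the node name is randomly assigned, the version depends on what you installed, and the build fields below are placeholders), but the response looks roughly like this:

{
  "status" : 200,
  "name" : "Some Random Node Name",
  "version" : {
    "number" : "1.3.2",
    "build_hash" : "...",
    "build_timestamp" : "...",
    "build_snapshot" : false,
    "lucene_version" : "4.9"
  },
  "tagline" : "You Know, for Search"
}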

 

Now, if I had DNS set up for this hostname, you would be able to hit Elasticsearch externally at http://elastic.brudtkuhl.com:9200, but for now you can just use the public IP address that Digital Ocean provides.

This is the first in a series of posts on my experiences working with Elasticsearch. Do you have any questions on how to install Elasticsearch?

Note: Links to DigitalOcean use my referral code. I'd recommend them even if I didn't have a referral code.