Ansible-managed Aerospike database clusters make our engineers smile. Now we would like to share the same tools we use with the rest of the world. That is why today we are announcing the release of three open source Ansible roles to deploy and manage Aerospike clusters on AWS.
In geospatial queries, we often need to quickly find all the points of interest (POIs) within a certain distance of an anchor point. In this post, we present a simple method that scales very well to billions of data points and is implemented in plain SQL, so it can be deployed on massive data processing systems like Redshift or Hive/SparkSQL on Hadoop without any geospatial support components.
Hadoop 2.x upgrades the previous web UI with a detailed ResourceManager. Having previously browsed the simpler JobTracker UI of Hadoop 1.x using lynx on the master node, finding things on the new interface took a bit of experimentation.
Our systems at Thinknear run 24/7, and monitoring them for errors is essential to preventing production issues that can hurt our business. One of the tools we use for error reporting is Honeybadger.
At Thinknear we believe automated tests are essential.
Redshift clusters need to accommodate tables and views created not only by our applications but also by our operations and data science teams. It is quite common for user-defined tables and views to rely on application-defined tables and views, which makes migrations a challenge. In the following post, we present two SQL queries that are useful for identifying dependencies before running migrations.
We are hiring like crazy here at Thinknear. (Current openings on our careers page.) We're solving massive scale challenges in the hundreds of thousands of requests per second, pressing databases to the limit, and we have more data than we know what to do with. As a result, we're looking for engineers, data scientists, and managers.
At Thinknear we always want to make sure we are using the right tool for the job. So when Redshift came out, we decided to evaluate our reporting and analytics pipeline and see if Redshift could help us improve it. At the time we were using Hive/Hadoop on EMR for all our reporting and analytics. We saw Redshift as a way to speed up our reporting infrastructure without completely rearchitecting it, and to give our business team a much easier way to access the data. Given these goals, we evaluated Redshift against our Hive/Hadoop solution and found the following pros and cons.
At Thinknear, we have an in-house administrative dashboard that our ad operations team uses to set up and manage ad campaigns. The dashboard is an AngularJS frontend with a Ruby on Rails backend, with the ui-router plugin for permalinks and navigation. While ngNewsletter's "Diving deep into the AngularUI Router" was a helpful primer, we found it didn't go deep enough.
During the early days of Thinknear, Resque was the most prevalent background job processor for our Rails applications. However, Resque was not multithread-friendly, and as our applications grew, this took a toll on our monthly Heroku bill.
Thinknear was privileged to be invited to present at AWS re:Invent 2014. Our topic was how we have scaled to billions of daily requests on Elastic Beanstalk.
Thinknear is releasing our aws_templates as an open-source project under APLv2. aws_templates is an example deployment and configuration setup for AWS Elastic Beanstalk.
Thinknear is delighted to announce that our tn_s3_file_uploader is released as open-source under APLv2. tn_s3_file_uploader is a Ruby gem that we use internally to upload log files to Amazon S3, where they can be stored until we need to retrieve them for further analysis or processing.
At Thinknear, we strive to measure anything that will help us understand the behavior of our systems in production. Collectd is a great tool for tracking instance-level statistics that CloudWatch does not provide.
Software development is hard. That’s a fact. Not only do we have a product that we must enhance, develop from scratch, or even maintain, but we also have people’s interactions, opinions, and emotions; the list goes on. And then there is design, technology, and execution. All these factors, all these variables, make software development the fantastic challenge that it is.