Plug and Play Data Movement in the Cloud


Any developer who has ever been tasked with consolidating data sources before performing analysis knows the challenge of silos, especially when data is sprawled across multiple third-party services. But what if you had to help a client pull application data from 47 different MySQL databases and then consolidate it into an operational data …

Replicating Amazon RDS MySQL to Amazon Redshift w/ Attunity CloudBeam


Bytecode IO recently helped one of its clients transition from a data infrastructure that was having some serious growing pains, causing frequent crashes and an overall loss of data insight, to a warehoused structure that could handle multiple data sources and provide improved insight. The catch was, as is so often the case, that time …

Crash Recovery With Beaver v.31, SQLite3 and Logstash v.1.4.2


As anyone who has worked with log data can tell you, getting that data stored and sorted in a searchable fashion in anything approaching real time is a set of tasks best left to automated processes, for the good of both efficiency and sanity. However, every so often, processes hang, software crashes and somebody has to …

Solving OOM by Resizing MySQL’s innodb_buffer_pool_size


Several developers at one of Bytecode IO’s clients reached out for help after some of their scripts began mysteriously dying. These scripts connected to a MySQL 5.6 server the developers had configured and were doing large SQL transforms. After examining the configured innodb_buffer_pool_size, we found that it was set to 92% of RAM and the …

Forking the Mixpanel Data API Client


One of Bytecode IO’s clients is moving from Mixpanel to an AWS Redshift data warehouse. I was tasked with extracting the data from the Mixpanel API and loading it into Redshift. As the client wanted to slowly reduce their use of Mixpanel, rather than a single dump and load, we went with an hourly …