Plug and Play Data Movement in the Cloud
Any developer who has ever been tasked with consolidating data sources before performing analysis knows the challenge of silos, especially when data is sprawled across multiple third-party services. But what if you had to help a client pull application data from 47 different MySQL databases and then consolidate it into an operational data …
Replicating Amazon RDS MySQL to Amazon Redshift w/ Attunity CloudBeam
Bytecode IO recently helped one of its clients transition from a data infrastructure that was having some serious growing pains, causing frequent crashes and an overall loss of data insight, to a warehoused structure that could handle multiple data sources and provide improved insight. The catch was, as is so often the case, that time …
Crash Recovery With Beaver v.31, SQLite3 and Logstash v.1.4.2
As anyone who has worked with log data can tell you, getting that data stored and sorted in a searchable fashion in anything approaching real time is a set of tasks best left to automated processes, for the good of both efficiency and sanity. However, every so often, processes hang, software crashes and somebody has to …
Solving OOM by Resizing MySQL’s innodb_buffer_pool_size
Several developers at one of Bytecode IO’s clients reached out for help after some of their scripts began mysteriously dying. These scripts connected to a MySQL 5.6 server the developers had configured and were doing large SQL transforms. After examining the configured innodb_buffer_pool_size, we found that it was set to 92% of RAM and the …
Forking the Mixpanel Data API Client
One of Bytecode IO’s clients is moving from Mixpanel to an Amazon Redshift data warehouse. I was tasked with extracting the data from the Mixpanel API and loading it into Redshift. As the client wanted to reduce their use of Mixpanel gradually, rather than do a single dump and load, we went with an hourly …