{"id":21542,"date":"2014-04-13T13:29:01","date_gmt":"2014-04-13T11:29:01","guid":{"rendered":"https:\/\/mamchenkov.net\/wordpress\/?p=21542"},"modified":"2014-04-13T13:29:11","modified_gmt":"2014-04-13T11:29:11","slug":"scaling-the-facebook-data-warehouse-to-300-pb","status":"publish","type":"post","link":"https:\/\/mamchenkov.net\/wordpress\/2014\/04\/13\/scaling-the-facebook-data-warehouse-to-300-pb\/","title":{"rendered":"Scaling the Facebook data warehouse to 300 PB"},"content":{"rendered":"<!-- google_ad_section_start -->\n<p><a href=\"https:\/\/code.facebook.com\/posts\/229861827208629\/scaling-the-facebook-data-warehouse-to-300-pb\/\">Scaling the Facebook data warehouse to 300 PB<\/a><\/p>\n<blockquote><p>At Facebook, we have unique storage scalability challenges when it comes to our data warehouse. Our warehouse stores upwards of 300 PB of Hive data, with an incoming daily rate of about 600 TB. In the last year, the warehouse has seen a 3x growth in the amount of data stored. Given this growth trajectory, storage efficiency is and will continue to be a focus for our warehouse infrastructure.<\/p><\/blockquote>\n<!-- google_ad_section_end -->\n","protected":false},"excerpt":{"rendered":"<!-- google_ad_section_start -->\n<p>Scaling the Facebook data warehouse to 300 PB At Facebook, we have unique storage scalability challenges when it comes to our data warehouse. Our warehouse stores upwards of 300 PB of Hive data, with an incoming daily rate of about 600 TB. In the last year, the warehouse has seen a 3x growth in the &hellip; <a href=\"https:\/\/mamchenkov.net\/wordpress\/2014\/04\/13\/scaling-the-facebook-data-warehouse-to-300-pb\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Scaling the Facebook data warehouse to 300 PB<\/span><\/a><\/p>\n<!-- google_ad_section_end -->\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"link","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"_links_to":"","_links_to_target":""},"categories":[1,62],"tags":[1192,1559,2271,1057,1281,3212],"keyring_services":[],"class_list":["post-21542","post","type-post","status-publish","format-link","hentry","category-general","category-technology","tag-computer-science","tag-databases","tag-facebook","tag-performance","tag-scalability","tag-storage","post_format-post-format-link"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":23901,"url":"https:\/\/mamchenkov.net\/wordpress\/2015\/04\/16\/data-gravity\/","url_meta":{"origin":21542,"position":0},"title":"Data Gravity","author":"Leonid Mamchenkov","date":"April 16, 2015","format":false,"excerpt":"On the drive back home today I was listening to DevOps Cafe podcast, episode 59. \u00a0I've recently subscribed to this show and I think this was the first episode of it I ever heard. \u00a0It's one of many tech talk podcasts, where two or more people chat for a varied\u2026","rel":"","context":"In &quot;All&quot;","block_context":{"text":"All","link":"https:\/\/mamchenkov.net\/wordpress\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":22473,"url":"https:\/\/mamchenkov.net\/wordpress\/2014\/09\/03\/extract-transform-load\/","url_meta":{"origin":21542,"position":1},"title":"Extract, Transform, Load","author":"Leonid Mamchenkov","date":"September 3, 2014","format":false,"excerpt":"I've been doing all kinds of data migrations and system integration for years now. \u00a0But only yesterday I've learned that there is a very specific term linked to the process. In computing, extract, transform, and load (ETL) refers to a process in database usage and especially in data warehousing that:\u2026","rel":"","context":"In &quot;All&quot;","block_context":{"text":"All","link":"https:\/\/mamchenkov.net\/wordpress\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":27314,"url":"https:\/\/mamchenkov.net\/wordpress\/2017\/02\/09\/mcrouter-a-memcached-protocol-router\/","url_meta":{"origin":21542,"position":2},"title":"Mcrouter: a memcached protocol router","author":"Leonid Mamchenkov","date":"February 9, 2017","format":false,"excerpt":"Mcrouter is an Open Source tool developed by Facebook for scaling up the memcached deployments: Mcrouter is a memcached protocol router for scaling memcached (http:\/\/memcached.org\/) deployments. It's a core component of cache infrastructure at Facebook and Instagram where mcrouter handles almost 5 billion requests per second at peak. Here is\u2026","rel":"","context":"In &quot;All&quot;","block_context":{"text":"All","link":"https:\/\/mamchenkov.net\/wordpress\/category\/general\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/mamchenkov.net\/wordpress\/wp-content\/uploads\/2017\/02\/mcrouter-500x375.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":27030,"url":"https:\/\/mamchenkov.net\/wordpress\/2016\/12\/01\/amazon-snowmobile-a-truck-with-up-to-100-petabytes-of-storage\/","url_meta":{"origin":21542,"position":3},"title":"Amazon Snowmobile &#8211; a truck with up to 100 Petabytes of storage","author":"Leonid Mamchenkov","date":"December 1, 2016","format":false,"excerpt":"Back in my college days, I had a professor who frequently used Andrew Tanenbaum's quote in the networking class: Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway. I guess he wasn't the only one, as during this year's Amazon re:Invent 2016 conference, the\u2026","rel":"","context":"In &quot;All&quot;","block_context":{"text":"All","link":"https:\/\/mamchenkov.net\/wordpress\/category\/general\/"},"img":{"alt_text":"aws-snowmobile","src":"https:\/\/i0.wp.com\/mamchenkov.net\/wordpress\/wp-content\/uploads\/2016\/12\/AWS-Snowmobile-500x313.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":12790,"url":"https:\/\/mamchenkov.net\/wordpress\/2010\/07\/23\/on-scalability-of-mysql\/","url_meta":{"origin":21542,"position":4},"title":"On scalability of MySQL","author":"Leonid Mamchenkov","date":"July 23, 2010","format":false,"excerpt":"Anyone who says that MySQL is not scalable has no idea. \u00a0Facebook is one of the examples for a large deployment of MySQL: How big is Facebook\u2019s Internet infrastructure? Facebook VP of Technology Jeff Rothschild provided some details in a panel at the recent MySQL user conference. Rothschild says Facebook\u2026","rel":"","context":"In &quot;All&quot;","block_context":{"text":"All","link":"https:\/\/mamchenkov.net\/wordpress\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":28507,"url":"https:\/\/mamchenkov.net\/wordpress\/2018\/04\/20\/distributed-architecture-concepts-i-learned-while-building-a-large-payments-system\/","url_meta":{"origin":21542,"position":5},"title":"Distributed architecture concepts I learned while building a large payments system","author":"Leonid Mamchenkov","date":"April 20, 2018","format":false,"excerpt":"Gergely Orosz, an engineer who worked at Uber on the large scale payments system used by the company, shares some of the distributed architecture concepts he had to learn in the blog post titled \"Distributed architecture concepts I learned while building a large payments system\". The article is very well\u2026","rel":"","context":"In &quot;All&quot;","block_context":{"text":"All","link":"https:\/\/mamchenkov.net\/wordpress\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_sharing_enabled":true,"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/posts\/21542","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/comments?post=21542"}],"version-history":[{"count":0,"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/posts\/21542\/revisions"}],"wp:attachment":[{"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/media?parent=21542"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/categories?post=21542"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/tags?post=21542"},{"taxonomy":"keyring_services","embeddable":true,"href":"https:\/\/mamchenkov.net\/wordpress\/wp-json\/wp\/v2\/keyring_services?post=21542"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}