<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-6576681508814660796.post3147996761905650682..comments</id><updated>2012-01-14T17:33:57.929-05:00</updated><category term='ruby'/><category term='me'/><category term='document collaboration'/><category term='pixel perfect'/><category term='next level'/><category term='soccer'/><category term='retrospective'/><category term='collaboration'/><category term='cricket'/><category term='development'/><category term='team location'/><category term='india'/><category term='open source'/><category term='following'/><category term='hiring'/><category term='leading'/><category term='scrum'/><category term='war for talent'/><category term='SAS'/><category term='agile'/><category term='Gephi'/><category term='onshore'/><category term='offshore'/><category term='design'/><category term='vrp'/><category term='fun'/><category term='Strataconf'/><category term='product customizations'/><category term='learning'/><category term='high performing teams'/><category term='Social Network Analysis'/><category term='R'/><category term='vehicle routing problem'/><title type='text'>Comments on Enterprise Software Doesn't Have to Suck: Big data problems</title><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.rcasts.com/feeds/3147996761905650682/comments/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html'/><author><name>prasoonsharma</name><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>13</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-2458051590713174442</id><published>2012-01-14T17:33:57.929-05:00</published><updated>2012-01-14T17:33:57.929-05:00</updated><title type='text'>Think about this:  for the cost of an Intel 160Gb ...</title><content type='html'>Think about this:  for the cost of an Intel 160Gb SSD (320 or 510), one gets near RAM speeds from &amp;quot;disk files&amp;quot;.  I would expect that someone (not I, as my C is very old and clunky) will, in time, build a package that leverages SSDs.  As to just loading up on memory, most PCs these days ship with all DIMM slots filled with the maximum supported DIMM.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2458051590713174442'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2458051590713174442'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1326580437929#c2458051590713174442' title=''/><author><name>Robert Young</name><uri>http://www.blogger.com/profile/09056808374481236610</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='33' height='26' src='http://1.bp.blogspot.com/-Er9cWHhacCw/TgFoChg_y8I/AAAAAAAAACI/vqa9fbA02ko/s220/mecrop.jpg'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' 
source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-212003342'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-442551577509126744</id><published>2011-10-05T10:46:58.653-04:00</published><updated>2011-10-05T10:46:58.653-04:00</updated><title type='text'>Since memory is so cheap now, just install more me...</title><content type='html'>Since memory is so cheap now, just install more memory on your machine so that you don&amp;#39;t need a VM solution for Stata or R.  I regularly work on a dataset that is &amp;gt;30G on Stata on a desktop computer that has 32G of RAM installed.  As long as your RAM is greater than the dataset size, it will run fine.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/442551577509126744'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/442551577509126744'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1317826018653#c442551577509126744' title=''/><author><name>Anonymous</name><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-2042539779'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-1271742319055122950</id><published>2011-05-20T15:21:06.725-04:00</published><updated>2011-05-20T15:21:06.725-04:00</updated><title type='text'>I&amp;#39;m looking at Revolution R as well and like w...</title><content type='html'>I&amp;#39;m looking at Revolution R as well and like what I&amp;#39;ve seen so far. Their desktop product is easy to get up and running with and is really fast for analyzing 100s of millions of records. &lt;br /&gt;&lt;br /&gt;I&amp;#39;m also playing with FF package and will report back findings in a few weeks.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/1271742319055122950'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/1271742319055122950'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1305919266725#c1271742319055122950' title=''/><author><name>prasoonsharma</name><uri>http://www.blogger.com/profile/08307833695144399261</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-309575915'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-2934046988187614109</id><published>2011-05-19T08:53:35.678-04:00</published><updated>2011-05-19T08:53:35.678-04:00</updated><title type='text'>Revolution Analytics develops an enterprise versio...</title><content type='html'>Revolution Analytics develops an enterprise version of R that is specifically catered to working with Big Data and parallel processing.  You might want to give them a look.&lt;br /&gt;&lt;br /&gt;http://www.revolutionanalytics.com/</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2934046988187614109'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2934046988187614109'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1305809615678#c2934046988187614109' title=''/><author><name>Larry (IEOR Tools)</name><uri>http://www.ieortools.com</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-727315540'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-2025908155691701683</id><published>2011-04-27T11:00:48.002-04:00</published><updated>2011-04-27T11:00:48.002-04:00</updated><title type='text'>Thank you everyone for the 
suggestions. I&amp;#39;m in...</title><content type='html'>Thank you everyone for the suggestions. I&amp;#39;m investigating a few solutions and will report back my findings. cheers</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2025908155691701683'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2025908155691701683'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303916448002#c2025908155691701683' title=''/><author><name>prasoonsharma</name><uri>http://www.blogger.com/profile/08307833695144399261</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-309575915'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-5326480089583318700</id><published>2011-04-26T11:49:41.048-04:00</published><updated>2011-04-26T11:49:41.048-04:00</updated><title type='text'>Rick from SAS here. I think that the 2009 ASA Data...</title><content type='html'>Rick from SAS here. I think that the 2009 ASA Data Expo (http://stat-computing.org/dataexpo/2009/posters/) really helped expose many statistical programmers to the magnitude of data that corporations have to analyze every day.  
Taking part in the Expo was definitely an eye-opening experience for me, and it was fun to use SAS to analyze such a massive data set. For a summary, see http://support.sas.com/publishing/authors/extras/Wicklin_scgn-20-2.pdf&lt;br /&gt;&lt;br /&gt;In the open source world, Kane and Emerson&amp;#39;s bigmemory package (http://www.bigmemory.org/) is a great addition to the R arsenal. For his work on bigmemory, Kane was awarded the 2010 Chambers Award by the ASA Sections on Statistical Computing and Statistical Graphics.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/5326480089583318700'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/5326480089583318700'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303832981048#c5326480089583318700' title=''/><author><name>Rick Wicklin</name><uri>http://blogs.sas.com/iml</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-613311211'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-3901684513552374227</id><published>2011-04-24T16:28:41.170-04:00</published><updated>2011-04-24T16:28:41.170-04:00</updated><title type='text'>I regularly analyse &amp;gt; 15 GB data sets using 
the...</title><content type='html'>I regularly analyse &amp;gt; 15 GB data sets using the standard R distribution (and am the author of the second article you reference).  You do have to think and work somewhat differently from how the standard introductions to the language work, which is obviously a problem.  And of course it does depend on what you need to do - I did have problems around 100 million call records when I tried to do social network analysis the naive way [1] but I eventually found a more fruitful way of analysing that data set.&lt;br /&gt;&lt;br /&gt;Standard recommendations include the biglm, biganalytics, speedglm, and biglars packages, as well as DBI and friends.&lt;br /&gt;&lt;br /&gt;In general, and this is probably better suited for a blog post than a comment, my approach is to first work hard at the data selection and preparation to make sure I work on the right problem and then to look at algorithms that I can execute in chunks and then combine.  The latter is of course also essentially what SAS does.&lt;br /&gt;&lt;br /&gt;Allan&lt;br /&gt;&lt;br /&gt;[1] http://www.cybaea.net/Blogs/Data/SNA-with-R-Loading-large-networks-using-the-igraph-library.html</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/3901684513552374227'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/3901684513552374227'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303676921170#c3901684513552374227' title=''/><author><name>Allan Engelhardt</name><uri>http://www.cybaea.net/</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to 
xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-890261614'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-8757099142189070837</id><published>2011-04-22T14:56:08.707-04:00</published><updated>2011-04-22T14:56:08.707-04:00</updated><title type='text'>Stata is also limited by the ram on your computer,...</title><content type='html'>Stata is also limited by the ram on your computer, so wouldn&amp;#39;t help in this instance.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/8757099142189070837'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/8757099142189070837'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303498568707#c8757099142189070837' title=''/><author><name>Anonymous</name><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-1414446712'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-8382630362438263289</id><published>2011-04-22T13:28:23.298-04:00</published><updated>2011-04-22T13:28:23.298-04:00</updated><title type='text'>You say SAS and MapReduce, but you can also use R ...</title><content type='html'>You say SAS and MapReduce, but you can also use R with MapReduce, in case you (or your readers) didn&amp;#39;t know.&lt;br /&gt;&lt;br /&gt;Check out RHIPE, for starters:&lt;br /&gt;&lt;br /&gt;http://www.stat.purdue.edu/~sguha/rhipe/</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/8382630362438263289'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/8382630362438263289'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303493303298#c8382630362438263289' title=''/><author><name>Anonymous</name><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-1614816881'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-2569763314706505907</id><published>2011-04-22T10:35:02.792-04:00</published><updated>2011-04-22T10:35:02.792-04:00</updated><title type='text'>SAS is good for large datasets, as it has out-of-c...</title><content 
type='html'>SAS is good for large datasets, as it has out-of-core algorithms. Splus can also do this. Revolution Computing&amp;#39;s version of R does this. All of these are commercial products. In the open source domain I have found Python to be great. I would also look at open source databases, such as MySQL and SQLite, but I haven&amp;#39;t used these.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2569763314706505907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/2569763314706505907'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303482902792#c2569763314706505907' title=''/><author><name>Blaise</name><uri>http://www.blogger.com/profile/12276667747467385669</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-1065946409'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-6431986152602639200</id><published>2011-04-22T10:03:43.377-04:00</published><updated>2011-04-22T10:03:43.377-04:00</updated><title type='text'>Have a look at ROOT (root.cern.ch). It was created...</title><content type='html'>Have a look at ROOT (root.cern.ch). 
It was created for particle physics data, and we routinely analyze ntuples with &amp;gt; 1B events. You can have the data split in multiple files but then merge it for analysis. It&amp;#39;s got limitations too, but might be helpful.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/6431986152602639200'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/6431986152602639200'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303481023377#c6431986152602639200' title=''/><author><name>Carla Vale</name><uri>http://www.linkedin.com/pub/carla-m-vale/4/781/4ba</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-1711075777'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-3382437285206369251</id><published>2011-04-22T09:26:30.262-04:00</published><updated>2011-04-22T09:26:30.262-04:00</updated><title type='text'>Have you tried Python, it is also an open source l...</title><content type='html'>Have you tried Python, it is also an open source language and easy for starters. More importantly, you can access R from Python almost seamlessly with the package RPY. 
I ran into the same problem as you (although with less than the 100GB you are facing) and solved it with Python. I also wrote a post on it http://www.mathfinance.cn/life-is-short-use-python/.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/3382437285206369251'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/3382437285206369251'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303478790262#c3382437285206369251' title=''/><author><name>Quant</name><uri>http://www.mathfinance.cn</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-33302068'/></entry><entry><id>tag:blogger.com,1999:blog-6576681508814660796.post-1242895743652396390</id><published>2011-04-22T08:42:36.673-04:00</published><updated>2011-04-22T08:42:36.673-04:00</updated><title type='text'>Do you really need the huge amount of rows  in mem...</title><content type='html'>Do you really need the huge amount of rows  in memory? Perhaps chunking and the use of list/hash elements would succeed.&lt;br /&gt;&lt;br /&gt;My best friends for data preparation with GBs of data are currently awk, mawk, and Python.</content><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/1242895743652396390'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6576681508814660796/3147996761905650682/comments/default/1242895743652396390'/><link rel='alternate' type='text/html' href='http://www.rcasts.com/2011/04/big-data-problems.html?showComment=1303476156673#c1242895743652396390' title=''/><author><name>Christian</name><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img1.blogblog.com/img/blank.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.rcasts.com/2011/04/big-data-problems.html' ref='tag:blogger.com,1999:blog-6576681508814660796.post-3147996761905650682' source='http://www.blogger.com/feeds/6576681508814660796/posts/default/3147996761905650682' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-130128507'/></entry></feed>
