<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Using the R multicore package in Linux with wild and passionate abandon</title>
	<atom:link href="http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/</link>
	<description>Something to Chew On</description>
	<lastBuildDate>Wed, 07 Dec 2011 13:07:56 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: JD Long</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-1585</link>
		<dc:creator>JD Long</dc:creator>
		<pubDate>Thu, 29 Apr 2010 18:12:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-1585</guid>
		<description>John, I think you want to use snow or Rmpi. If you are already running MPI, then Rmpi is the way to go. If you just have a bunch of computers that are not currently running as a grid, I think snow is the way to go. I also highly recommend using the &quot;foreach&quot; package. foreach has backends for multicore, snow, and Rmpi, so you only have to change one line of code to switch from one parallel method to another.</description>
		<content:encoded><![CDATA[<p>John, I think you want to use snow or Rmpi. If you are already running MPI, then Rmpi is the way to go. If you just have a bunch of computers that are not currently running as a grid, I think snow is the way to go. I also highly recommend using the &#8220;foreach&#8221; package. foreach has backends for multicore, snow, and Rmpi, so you only have to change one line of code to switch from one parallel method to another.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Ramey</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-1584</link>
		<dc:creator>John Ramey</dc:creator>
		<pubDate>Thu, 29 Apr 2010 17:00:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-1584</guid>
		<description>I have several networked computers that each have multiple cores. What R package would you recommend to take advantage of not only the multiple cores but the multiple computers?

An even nicer feature that I&#039;m seeking is a job queueing system for this kind of setup; if this is not available, that is okay. I&#039;m really trying to figure out what my options are.

Please note that I prefer an easier setup over an efficient setup because many of the people that will use this system are not proficient in R.</description>
		<content:encoded><![CDATA[<p>I have several networked computers that each have multiple cores. What R package would you recommend to take advantage of not only the multiple cores but the multiple computers?</p>
<p>An even nicer feature that I&#8217;m seeking is a job queueing system for this kind of setup; if this is not available, that is okay. I&#8217;m really trying to figure out what my options are.</p>
<p>Please note that I prefer an easier setup over an efficient setup because many of the people that will use this system are not proficient in R.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aleks Clark</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-1409</link>
		<dc:creator>Aleks Clark</dc:creator>
		<pubDate>Fri, 16 Apr 2010 08:06:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-1409</guid>
		<description>Ever tried snowfall? sfClusterApplyLB is pretty efficient at maxing out my cores. I&#039;m not sure how the internals work, but it gets the job done without having to mess with what kind of jobs they are.</description>
		<content:encoded><![CDATA[<p>Ever tried snowfall? sfClusterApplyLB is pretty efficient at maxing out my cores. I&#8217;m not sure how the internals work, but it gets the job done without having to mess with what kind of jobs they are.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JD Long</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-1236</link>
		<dc:creator>JD Long</dc:creator>
		<pubDate>Tue, 06 Apr 2010 20:16:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-1236</guid>
		<description>Ayman, thanks for posting that info! That is VERY helpful and I did not understand this before you posted it.</description>
		<content:encoded><![CDATA[<p>Ayman, thanks for posting that info! That is VERY helpful and I did not understand this before you posted it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ayman</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-1127</link>
		<dc:creator>ayman</dc:creator>
		<pubDate>Mon, 29 Mar 2010 23:14:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-1127</guid>
		<description>The way mc.preschedule works is this. Let&#039;s say you have 2 cores and a list of 10 things.

If it&#039;s set to TRUE, as in mclapply(1:10, identity, mc.preschedule=TRUE), then R spawns two children (one for each core) and roughly sends all the odd numbers to the first core and all the even numbers to the second core. Let&#039;s say one of the even numbers (say #6) takes a long time to do. The odd numbers then finish faster, and core #1 will appear to be idle while core #2 finishes.

If it&#039;s set to FALSE, as in mclapply(1:10, identity, mc.preschedule=FALSE), then R will spawn 1 child for each number. As each child finishes, it launches the next. So in this example, while #6 takes a long time and holds up a core, R can still push the other jobs through the remaining core. In the end, R will have launched as many children as the length of the input sequence (1:10 in this case). This takes some time, as it has to launch a child and clean up afterwards for each element.

So generally, if you have a lot of little fast jobs, set it to TRUE to get better performance. If you have a lot of variance in the run time, set it to FALSE.</description>
		<content:encoded><![CDATA[<p>The way mc.preschedule works is this. Let&#8217;s say you have 2 cores and a list of 10 things.</p>
<p>If it&#8217;s set to TRUE, as in mclapply(1:10, identity, mc.preschedule=TRUE), then R spawns two children (one for each core) and roughly sends all the odd numbers to the first core and all the even numbers to the second core. Let&#8217;s say one of the even numbers (say #6) takes a long time to do. The odd numbers then finish faster, and core #1 will appear to be idle while core #2 finishes.</p>
<p>If it&#8217;s set to FALSE, as in mclapply(1:10, identity, mc.preschedule=FALSE), then R will spawn 1 child for each number. As each child finishes, it launches the next. So in this example, while #6 takes a long time and holds up a core, R can still push the other jobs through the remaining core. In the end, R will have launched as many children as the length of the input sequence (1:10 in this case). This takes some time, as it has to launch a child and clean up afterwards for each element.</p>
<p>So generally, if you have a lot of little fast jobs, set it to TRUE to get better performance. If you have a lot of variance in the run time, set it to FALSE.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: userR</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-1056</link>
		<dc:creator>userR</dc:creator>
		<pubDate>Tue, 23 Mar 2010 23:46:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-1056</guid>
		<description>Have you tried running Ubuntu&#039;s revolution-r package in your VM? It includes a library with faster (multi-threaded) versions of many of the base functions.

I&#039;ve noticed that for certain operations, there are some substantial speedups (for my toy comparison on a 2-core Ubuntu VM, going from 8 seconds to 1.7).

On Ubuntu 9.10, just run &quot;sudo aptitude install revolution-r&quot;

My benchmark was:

 m &lt;- matrix(rnorm(2 * 10^7), ncol = 10^4)
 system.time(crossprod(m))

Good luck!</description>
		<content:encoded><![CDATA[<p>Have you tried running Ubuntu&#8217;s revolution-r package in your VM? It includes a library with faster (multi-threaded) versions of many of the base functions.</p>
<p>I&#8217;ve noticed that for certain operations, there are some substantial speedups (for my toy comparison on a 2-core Ubuntu VM, going from 8 seconds to 1.7).</p>
<p>On Ubuntu 9.10, just run &#8220;sudo aptitude install revolution-r&#8221;</p>
<p>My benchmark was:</p>
<p> m &lt;- matrix(rnorm(2 * 10^7), ncol = 10^4)<br />
 system.time(crossprod(m))</p>
<p>Good luck!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: dude</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-584</link>
		<dc:creator>dude</dc:creator>
		<pubDate>Wed, 17 Feb 2010 04:40:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-584</guid>
		<description>How about some seamless integration of plyr with multicore - that would be awesome!</description>
		<content:encoded><![CDATA[<p>How about some seamless integration of plyr with multicore &#8211; that would be awesome!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Cerebral Mastication &#187; Blog Archive &#187; You can Hadoop it! It&#8217;s elastic! Boogie woogie woog-ie!</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-575</link>
		<dc:creator>Cerebral Mastication &#187; Blog Archive &#187; You can Hadoop it! It&#8217;s elastic! Boogie woogie woog-ie!</dc:creator>
		<pubDate>Tue, 16 Feb 2010 18:31:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-575</guid>
		<description>[...] to visit with Madam Wu, you ask? Well the short answer is Hadoop. Yeah, the cute little elephant. As I have told you before, multicore makes your R code run fast by using worm holes to shoot your results back from the [...]</description>
		<content:encoded><![CDATA[<p>[...] to visit with Madam Wu, you ask? Well the short answer is Hadoop. Yeah, the cute little elephant. As I have told you before, multicore makes your R code run fast by using worm holes to shoot your results back from the [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JD Long</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-532</link>
		<dc:creator>JD Long</dc:creator>
		<pubDate>Fri, 12 Feb 2010 16:36:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-532</guid>
		<description>Josh:

Yeah, I profiled and know exactly what&#039;s time-consuming. Unfortunately it&#039;s the sauce ;) 

I do all my random draws prior to starting the runs, so the QRMlib process doesn&#039;t hold things up. In my sims I take my random draw, do some curve fitting to resolve a number of relationships, and then calculate gains/losses for 40K+ policy units. It&#039;s basically the valuation of the policies that takes the longest. 

Probably a reasonable analogy would be valuing 40K+ options in 40 markets.</description>
		<content:encoded><![CDATA[<p>Josh:</p>
<p>Yeah, I profiled and know exactly what&#8217;s time-consuming. Unfortunately it&#8217;s the sauce <img src='http://www.cerebralmastication.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  </p>
<p>I do all my random draws prior to starting the runs, so the QRMlib process doesn&#8217;t hold things up. In my sims I take my random draw, do some curve fitting to resolve a number of relationships, and then calculate gains/losses for 40K+ policy units. It&#8217;s basically the valuation of the policies that takes the longest. </p>
<p>Probably a reasonable analogy would be valuing 40K+ options in 40 markets.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joshua Ulrich</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/comment-page-1/#comment-528</link>
		<dc:creator>Joshua Ulrich</dc:creator>
		<pubDate>Fri, 12 Feb 2010 04:21:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562#comment-528</guid>
		<description>JD,

Have you profiled your code to determine the bottleneck?  Is it something in QRMlib or does your secret sauce need some spice?

Best,
Josh</description>
		<content:encoded><![CDATA[<p>JD,</p>
<p>Have you profiled your code to determine the bottleneck?  Is it something in QRMlib or does your secret sauce need some spice?</p>
<p>Best,<br />
Josh</p>
]]></content:encoded>
	</item>
</channel>
</rss>

