<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cerebral Mastication &#187; ec2</title>
	<atom:link href="http://www.cerebralmastication.com/tag/ec2/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.cerebralmastication.com</link>
	<description>Something to Chew On</description>
	<lastBuildDate>Wed, 07 Dec 2011 13:08:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Shell scripting EC2 for fun and profit</title>
		<link>http://www.cerebralmastication.com/2011/05/shell-scripting-ec2-for-fun-and-profit/</link>
		<comments>http://www.cerebralmastication.com/2011/05/shell-scripting-ec2-for-fun-and-profit/#comments</comments>
		<pubDate>Fri, 06 May 2011 20:57:40 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=993</guid>
		<description><![CDATA[Lately I&#8217;ve been doing some work with creating ad-hoc clusters of EC2 machines. My ultimate goal is to create a simple way to spin up a cluster of EC2 machines for use with Bryan Lewis&#8217;s very cool doRedis backend for the R foreach package. But that&#8217;s a whole other post. What I was scratching my [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.thinkgeek.com/tshirts-apparel/unisex/frustrations/374d/" onclick="pageTracker._trackPageview('/outgoing/www.thinkgeek.com/tshirts-apparel/unisex/frustrations/374d/?referer=');"><img class="alignleft size-full wp-image-994" style="border: 1px solid black; margin: 2px;" title="lg-go-away-tshirt" src="http://www.cerebralmastication.com/wp-content/uploads/2011/05/lg-go-away-tshirt.jpg" alt="" width="179" height="218" /></a>Lately I&#8217;ve been doing some work with creating ad-hoc clusters of EC2 machines. My ultimate goal is to create a simple way to spin up a cluster of EC2 machines for use with Bryan Lewis&#8217;s very cool <a href="http://cran.r-project.org/web/packages/doRedis/index.html" onclick="pageTracker._trackPageview('/outgoing/cran.r-project.org/web/packages/doRedis/index.html?referer=');">doRedis backend</a> for the R <a href="http://cran.r-project.org/web/packages/foreach/index.html" onclick="pageTracker._trackPageview('/outgoing/cran.r-project.org/web/packages/foreach/index.html?referer=');">foreach package</a>. But that&#8217;s a whole other post. What I was scratching my head about today was that I&#8217;d really just like to, with a single command, spin up an EC2 instance, wait for it to come up, and then ssh into it. I do this iteration about 20 times a day when I&#8217;m testing things, so it seemed to make sense to shell script it.<br />
To do this, you need the EC2 command line tools installed on your workstation. In Ubuntu that&#8217;s as easy as `sudo apt-get install ec2-api-tools`.</p>
<p>So here&#8217;s a short shell script to spin up an instance, wait 30 seconds, then connect:<br />
<script src="http://gist.github.com/959780.js"></script></p>
<p>If you&#8217;re reading this through an RSS reader, you can see the script over at <a href="https://gist.github.com/959780" onclick="pageTracker._trackPageview('/outgoing/gist.github.com/959780?referer=');">github</a>.</p>
<p>Obviously you&#8217;ll need to change the parameters at the top of the script to suit your needs. But since this was a bit of a pain in the donkey hole for me to figure out, I thought I would share.</p>
<p>If you want to help out, I&#8217;d love you to enlighten me on how to have the script figure out if an instance has finished booting so I could eliminate the sleep step.</p>
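<p>One way to drop the fixed sleep might be to poll until the machine answers. Here&#8217;s a minimal sketch, assuming that a successful ssh login is a good enough signal that boot has finished. The function name retry_until and the HOST and KEY variables are made up for illustration and are not part of the gist above:</p>
<blockquote>
<pre># retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have failed.
retry_until() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Hypothetical usage (HOST and KEY are placeholders). "true" runs on the
# remote side, so this only succeeds once sshd is accepting logins:
# retry_until 30 5 ssh -i "$KEY" -o BatchMode=yes -o ConnectTimeout=5 -o StrictHostKeyChecking=no "root@$HOST" true</pre>
</blockquote>
<p>With 30 tries spaced 5 seconds apart the instance gets roughly two and a half minutes to come up before the script gives up, and on a fast boot you connect as soon as the first login works.</p>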
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2011/05/shell-scripting-ec2-for-fun-and-profit/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Details of two-way sync between two Ubuntu machines</title>
		<link>http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/</link>
		<comments>http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/#comments</comments>
		<pubDate>Mon, 18 Apr 2011 20:48:32 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[workflow]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=966</guid>
		<description><![CDATA[In a previous post I discussed my frustrations with trying to get Dropbox or Spideroak to perform BOTH encrypted remote backup AND fast two-way file syncing. This is the detail of how I set up two machines, both running Ubuntu 10.10, to perform two-way sync where a file change on either machine [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2011/04/SyncDifferent.png"><img class="alignleft size-full wp-image-956" title="sync" src="http://www.cerebralmastication.com/wp-content/uploads/2011/04/SyncDifferent.png" alt="" width="128" height="128" /></a>In a <a href="http://www.cerebralmastication.com/2011/04/fast-two-way-sync-in-ubuntu/">previous post</a> I discussed my frustrations with trying to get Dropbox or Spideroak to perform BOTH encrypted remote backup AND fast two-way file syncing. This is the detail of how I set up two machines, both running Ubuntu 10.10, to perform two-way sync where a file change on either machine will result in that change being replicated on the other machine.</p>
<p>I initially tried running Unison on BOTH my laptop and the server and had the server Unison set to sync with my laptop back through an SSH reverse proxy. After testing this for a while I discovered this is totally the wrong way to do it. The problem is that the Unison process makes temp directories and files in the file system of the target. So my Unison job on the laptop would be trying to sync files and, in the process, create temp files which would kick off a Unison sync on the server which would make temp files on the laptop&#8230; I think you can see how convoluted this gets.</p>
<p>So a much better solution is to only run Unison from one machine (I chose my laptop) and have the other machine (server in my case) send an SSH command (over the aforementioned reverse proxy) to the laptop asking the laptop to kick off a Unison sync. This way all of the syncs happen from the laptop.</p>
<p>So, in short, both machines run lsyncd which monitors files for changes. I keep up an SSH tunnel with reverse port forwarding which forwards a remote machine port back to my laptop&#8217;s port 22 (SSH). Unison need be installed ONLY on my laptop. When a change happens on my laptop, lsyncd fires off a Unison sync from my laptop that syncs it with the server. When a file changes on the server, the lsyncd job on the server makes a connection to my laptop via ssh and fires off a Unison sync between my laptop and the server.</p>
<p>Here&#8217;s an example of my lsyncd config scripts:</p>
<p><strong>Laptop:</strong></p>
<blockquote><p>settings = {<br />
logfile    = "/home/jal/lsyncd/laptop/lsyncd.log",<br />
statusFile = "/home/jal/lsyncd/laptop/lsyncd.status",<br />
maxDelays  = 15,<br />
--nodaemon   = true,<br />
}</p>
<p>runUnison2 = {<br />
maxProcesses = 1,<br />
delay = 15,<br />
onAttrib  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onCreate  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onDelete  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onModify  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onMove    = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
}</p>
<p>sync{runUnison2, source="/home/jal/Documents"}</p></blockquote>
<p><strong>Server:</strong></p>
<blockquote><p>settings = {<br />
logfile    = "/home/jal/lsyncd/server/lsyncd.log",<br />
statusFile = "/home/jal/lsyncd/server/lsyncd.status",<br />
maxDelays  = 15,<br />
--nodaemon   = true,<br />
}</p>
<p>runUnison2 = {<br />
maxProcesses = 1,<br />
delay = 15,<br />
onAttrib  = "ssh localhost -p 5432 unison -batch  /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onCreate  = "ssh localhost -p 5432 unison -batch  /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onDelete  = "ssh localhost -p 5432 unison -batch  /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onModify  = "ssh localhost -p 5432 unison -batch  /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onMove    = "ssh localhost -p 5432 unison -batch  /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
}</p>
<p>sync{runUnison2, source="/home/jal/Documents"}</p></blockquote>
<p>Keep in mind that I am using version 2 of lsyncd which can be downloaded here: <a href="http://code.google.com/p/lsyncd/" onclick="pageTracker._trackPageview('/outgoing/code.google.com/p/lsyncd/?referer=');">http://code.google.com/p/lsyncd/</a></p>
<p>The version of lsyncd available in the Ubuntu repo is version 1.x which does not use the same config format as I illustrate above. However, if you run into dependency issues with v2, the easiest thing to do is install the repo version which will install dependencies and then manually download and install v2 from the above URL.</p>
<p>My reverse port forwarding set up looks like this:</p>
<blockquote><p>autossh -2 -4 -X -R 5432:localhost:22 12.34.56.78</p></blockquote>
<p>The -R bit forwards remote port 5432 to my laptop&#8217;s port 22, which is the SSH port. So if I run ssh localhost -p 5432 on my server, what actually happens is that I am sshing from the remote machine to my laptop.</p>
<p><strong>Notes:</strong></p>
<ul>
<li>The IP address of my server in this example is 12.34.56.78.</li>
<li>Don&#8217;t try to sync the directories where the lsyncd logs are kept. That will result in an endless sync cycle as each machine keeps noticing changes. Don&#8217;t ask me how I know this.</li>
<li>The command to start the sync on the laptop is &#8220;lsyncd /home/jal/lsyncd/laptop/configfile&#8221; where configfile is the above lsyncd configuration file.</li>
<li>lsyncd could, conceivably, tell Unison to sync only the part of the directory tree that changed. I have not been able to make that feature work right, however. And it only takes Unison a few seconds to sync, so I&#8217;ve not worried about it.</li>
</ul>
<p>This has greatly sped up my <a href="http://rstudio.org" onclick="pageTracker._trackPageview('/outgoing/rstudio.org?referer=');">RStudio</a> based workflow when doing analysis with R. Now when I change files on my server using RStudio they are immediately (well it waits 15 seconds) replicated to my local machine and vice versa!</p>
<p>Good luck and if you have any suggestions please post a comment!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Starting an EC2 Machine Then Setting Up a Socks Proxy&#8230; From R!</title>
		<link>http://www.cerebralmastication.com/2010/07/starting-an-ec2-machine-then-setting-up-a-socks-proxy-from-r/</link>
		<comments>http://www.cerebralmastication.com/2010/07/starting-an-ec2-machine-then-setting-up-a-socks-proxy-from-r/#comments</comments>
		<pubDate>Fri, 16 Jul 2010 22:07:12 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[proxy]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=748</guid>
		<description><![CDATA[I do some work from home, some work from an office in Chicago and some work on the road. It&#8217;s not uncommon for me to want to tunnel all my web traffic through a VPN tunnel. In one of my previous blog posts I alluded to using Amazon EC2 as a way to get around [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2010/07/firewallkat.jpg"><img class="alignleft size-full wp-image-765" title="firewallkat" src="http://www.cerebralmastication.com/wp-content/uploads/2010/07/firewallkat.jpg" alt="" width="361" height="312" /></a>I do some work from home, some work from an office in Chicago and some work on the road. It&#8217;s not uncommon for me to want to tunnel all my web traffic through a VPN tunnel. In one of my previous blog posts I <a href="http://www.cerebralmastication.com/2009/11/using-amazon-ec2-to-thwart-crappy-internal-it-services/">alluded to using Amazon EC2 as a way to get around your corporate IT</a> <span style="text-decoration: line-through;">mind control voyeurs</span> service providers. This tunneling method is one of the 5 or so ways I have used EC2 to set up a tunnel. I used to fire these tunnels up manually using the <a href="https://console.aws.amazon.com" onclick="pageTracker._trackPageview('/outgoing/console.aws.amazon.com?referer=');">Amazon AWS Management Console</a> then opening a shell prompt and entering:</p>
<blockquote>
<pre>ssh -i ~/MyPersonalKey.pem -D 9999 root@ec2-184-73-41-72.compute-1.amazonaws.com</pre>
</blockquote>
<p>the -i switch tells ssh to use my RSA identity file stored in ~/MyPersonalKey.pem</p>
<p>the machine name (ec2-184-73-41-72.compute-1.amazonaws.com) I get from the AWS Management Console</p>
<p>the -D is the magic. -D opens a dynamic port forwarding tunnel between my Linux box and the EC2 machine. This is, for all intents and purposes, an encrypted SOCKS4 proxy on port 9999 of localhost. Then I just have to change my proxy settings in Firefox to use a SOCKS host.</p>
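<p>If you&#8217;d rather not retype that, the command wraps up nicely in a small shell function. This is a sketch, not what I actually run: the names run, open_socks_tunnel and DRY_RUN are made up, and I&#8217;ve added -N (no remote command) so the session only carries the tunnel:</p>
<blockquote>
<pre># run: execute a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# open_socks_tunnel KEY_FILE HOST [LOCAL_PORT]
# -i picks the identity file, -D opens the dynamic (SOCKS) forward,
# -N skips the remote shell so the session exists only for the tunnel.
open_socks_tunnel() {
    key_file=$1
    host=$2
    local_port=${3:-9999}
    run ssh -i "$key_file" -N -D "$local_port" "root@$host"
}</pre>
</blockquote>
<p>Setting DRY_RUN=1 prints the ssh command instead of running it. Once the tunnel is up, something like curl --socks5-hostname localhost:9999 http://example.com/ is a quick way to check it from the command line, since ssh&#8217;s -D speaks SOCKS5 as well as SOCKS4.</p>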
<p>Now that&#8217;s all pretty easy. And I like easy. But it&#8217;s not easy ENOUGH. You see, I&#8217;m lazy. I&#8217;m not just lazy in the &#8220;I&#8217;ll do it mañana&#8221; sort of way, but in the &#8220;I&#8217;m too damn lazy to click my mouse 5 times&#8221; way.</p>
<p>So I want this easier. Well, I can make the proxy settings in Firefox easier through the use of the <a href="https://addons.mozilla.org/en-US/firefox/addon/1557/" onclick="pageTracker._trackPageview('/outgoing/addons.mozilla.org/en-US/firefox/addon/1557/?referer=');">Quick Proxy extension for Firefox</a>. That&#8217;s a good start. It turns on and off the proxy with a single mouse click. But I still have to go into the AWS management web site, fire up a machine then log in via SSH. Let&#8217;s make that part easier!</p>
<p>While it&#8217;s not simple to install and configure, the EC2 command line tools are going to be required in order to make a script that fires up EC2 and then connects to the instance with ssh. I struggled getting the tools to run until I found <a href="http://linuxsysadminblog.com/2009/06/howto-get-started-with-amazon-ec2-api-tools/" onclick="pageTracker._trackPageview('/outgoing/linuxsysadminblog.com/2009/06/howto-get-started-with-amazon-ec2-api-tools/?referer=');">this tutorial</a>.</p>
<p>Your file locations and names may be different than the tutorial. Change appropriately. I followed the tutorial instructions but I created a key named ec2ApiTools which will come in handy later.</p>
<p>After you get the EC2 tools up and running and you can do something like list the available AMIs without an error you can stop with the tutorial. I&#8217;ve been doing a lot of shell scripting lately so I said to myself, &#8220;Self, let&#8217;s script the ssh connection in R!&#8221; For the record, I always end my imperatives with an exclamation point which I verbally pronounce as, &#8220;BANG!&#8221; As a result, when I talk to myself it sounds like two 10 year old boys playing cops and robbers. Anyhow, I did script it with R using Rscript. Because I&#8217;m a man who listens to myself.</p>
<p>And since you were kind enough to slog through my channeling the drunken ghost of James Joyce, here&#8217;s my script:</p>
<script src="http://gist.github.com/478930.js"></script>
<p>If you&#8217;re reading this in an RSS reader or for some other reason don&#8217;t see an R script above, <a href="http://gist.github.com/478930#file_start_ec2_instance_ssh.r" onclick="pageTracker._trackPageview('/outgoing/gist.github.com/478930_file_start_ec2_instance_ssh.r?referer=');">here&#8217;s your link</a>.</p>
<p>The only two EC2 API commands I use in the script are <em>ec2-run-instances</em>, which starts the instance, and <em>ec2-describe-instances</em>, which gives me a list of running instances and their details. The rest of the script is simply parsing the output and figuring out which instance was started last.</p>
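<p>My script does its parsing in R, but the idea is easy to show in shell as well. This is only a sketch: the function name newest_instance_host is made up, and the field positions (public DNS name in field 4, state in field 6, launch time in field 11 of a tab separated INSTANCE row) are my assumption about the output layout, so check them against what your version of the tools prints:</p>
<blockquote>
<pre># newest_instance_host: read ec2-describe-instances style text on stdin and
# print the public DNS name of the running instance with the latest launch time.
# ASSUMPTION: tab separated INSTANCE rows, DNS name in field 4, state in
# field 6, ISO 8601 launch time in field 11. Adjust if your output differs.
newest_instance_host() {
    awk -F '\t' -v dns=4 -v state=6 -v launched=11 '
        $1 == "INSTANCE" {
            if ($state == "running") print $launched "\t" $dns
        }' | sort | tail -n 1 | cut -f 2
}</pre>
</blockquote>
<p>You would use it as ec2-describe-instances | newest_instance_host. Because the launch times are ISO 8601 strings in the same time zone, a plain text sort puts the most recent one last.</p>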
<p>I&#8217;ve now set up a launcher panel item that starts the script. Then when I see the xterm window come up I click the little red button in the lower right corner of my browser which switches on the Firefox proxy. Then I&#8217;m safe to surf <a href="http://www.sofmag.com/" onclick="pageTracker._trackPageview('/outgoing/www.sofmag.com/?referer=');">Soldier of Fortune Magazine</a> without the interference of my corp firewall.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2010/07/starting-an-ec2-machine-then-setting-up-a-socks-proxy-from-r/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Using the R multicore package in Linux with wild and passionate abandon</title>
		<link>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/</link>
		<comments>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/#comments</comments>
		<pubDate>Tue, 09 Feb 2010 19:57:20 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=562</guid>
		<description><![CDATA[One of my primary uses for R is to build stochastic simulations of insurance portfolios and reinsurance treaties. It&#8217;s not uncommon for each of my simulations to take 20 seconds or more to complete (if you&#8217;re doing the math, that&#8217;s 55 hours for 10K sims or, approximately 453 games of solitaire) . Initially I ran [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2010/02/amd_mc_processing.jpg"><img class="alignleft size-full wp-image-586" style="border: 0pt none; margin: 20px;" title="amd_mc_processing" src="http://www.cerebralmastication.com/wp-content/uploads/2010/02/amd_mc_processing.jpg" alt="" width="214" height="193" /></a>One of my primary uses for R is to build stochastic simulations of insurance portfolios and reinsurance treaties. It&#8217;s not uncommon for each of my simulations to take 20 seconds or more to complete (if you&#8217;re doing the math, that&#8217;s about 55 hours for 10K sims or, approximately, 453 games of solitaire). Initially I ran my sims in R running on an <a href="http://www.virtualbox.org/" onclick="pageTracker._trackPageview('/outgoing/www.virtualbox.org/?referer=');">Oracle VirtualBox</a> (Oracle now owns VirtualBox! *gasp*) running Ubuntu. Lately I&#8217;ve moved to running my sims on EC2 machines. I&#8217;m not yet doing Rmpi clustering, although that is on my roadmap. Currently I just fire up a couple of 8 core instances and run 5K sims on each one then FTP the results back to my desktop. It&#8217;s not very sexy, but it gets the job done&#8230; I guess the same could be said of myself, except substitute &#8220;makes slurping sounds eating udon&#8221; in the place of &#8220;gets the job done.&#8221;</p>
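<p>To spell out that arithmetic (and what a couple of 8 core machines ought to buy me in the ideal case), here&#8217;s a quick shell sketch. The 16 workers figure assumes one simulation per core across two 8 core instances with no overhead, which is optimistic:</p>
<blockquote>
<pre># Back-of-envelope run time using the figures from the paragraph above.
seconds_per_sim=20
total_sims=10000
workers=16   # ASSUMPTION: two 8 core instances, one simulation per core

serial_seconds=$((seconds_per_sim * total_sims))
serial_hours=$((serial_seconds / 3600))               # integer division
parallel_minutes=$((serial_seconds / workers / 60))   # ideal case, no overhead

echo "serial: about $serial_hours hours"
echo "on $workers cores: about $parallel_minutes minutes"</pre>
</blockquote>
<p>That prints about 55 hours for the serial run and about 208 minutes when the work is spread evenly over 16 cores.</p>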
<p>When running processor intensive crap (that&#8217;s a stochastic modeling term) the single threaded nature of R is painful. In Linux or Mac (i.e. NOT Windows) the <a href="http://www.rforge.net/doc/packages/multicore/multicore.html" onclick="pageTracker._trackPageview('/outgoing/www.rforge.net/doc/packages/multicore/multicore.html?referer=');">multicore package </a>is a real godsend. I did a quick code review and, from what I can tell, multicore exploits worm holes to travel back in time and reports your results in a fraction of the time you would expect it to take. Seriously. I expect that as the code matures my computer will fill up with simulation results from simulations which I have not even coded yet. It&#8217;s almost like magic, except without the rabbit and hat.</p>
<p>The crux of the package is a parallel-ized version of lapply() called mclapply(). I believe the mc stands for &#8216;magic carpet&#8217; and is an allusion to the worm hole technology. So how does one harness this package for <span style="text-decoration: line-through;">nefarious self interest </span>doing parallel operations in R? The ultra short answer is: write your R code so that the most processor intensive bit is done with an lapply() function. Then replace the lapply() with mclapply().  Of course you have to load the multicore package before you run it. But that&#8217;s basically it.</p>
<p>How I implement mclapply() is thusly: I build a table with all my random draws for my simulations. So if I have 20 variables and want to run 10,000 simulations then I&#8217;ll build a data frame with all 200,000 values (generally 10K rows and 21 columns for 20 variables + an index). The index keeps track of the draw number. Then I have code that performs the &#8216;valuation&#8217; based on a single observation of the 20 variables. I wrap the valuation step in a function and then call the valuation process 10,000 times with mclapply(). So it might look something like this:</p>
<blockquote><p>myOutput &lt;- mclapply( drawList, function(x) valuationReturns(drawNumber=x))</p></blockquote>
<p>The drawList object is simply a list of the possible indexes (i.e. 1:10000). When the code has iterated over each value from drawList the results will be in the myOutput object. Tada!</p>
<p>I recommend the <a href="http://htop.sourceforge.net/" onclick="pageTracker._trackPageview('/outgoing/htop.sourceforge.net/?referer=');">htop program </a>for tracking what&#8217;s going on with processor utilization in Linux (I presume Mac too if you ask Steve Jobs nicely). If everything is cranking well, and you have 8 cores, you might see an image that looks something like this:</p>
<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2010/02/r-on-ec21.png"><img class="size-full wp-image-564 alignnone" title="r on ec2" src="http://www.cerebralmastication.com/wp-content/uploads/2010/02/r-on-ec21.png" alt="" width="535" height="400" /></a></p>
<p>I don&#8217;t understand time travel, but I&#8217;ve found that I have better luck if I set mc.preschedule=FALSE. Apparently prescheduled magic carpets are finicky. If I leave mc.preschedule to the default of TRUE then I find that often some of my cores go underutilized.</p>
<p>Let me know if you have other multicore tips and tricks.</p>
<p>If you want to give me shit for running my simulations as root, feel free. I&#8217;m impervious to your &#8220;best practices&#8221; mumbo jumbo. La la la la la la!! Not listening!</p>
<p>Special thanks to <a href="http://www.cis.udel.edu/~cavazos/index.php?page=multicore-programming" onclick="pageTracker._trackPageview('/outgoing/www.cis.udel.edu/_cavazos/index.php?page=multicore-programming&amp;referer=');">John Cavazos over at the University of Delaware</a> from whom I stole the MC for Dummies image. John, you&#8217;re a gentleman and a humble scholar. Damn few of us left.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2010/02/using-the-r-multicore-package-in-linux-with-wild-and-passionate-abandon/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>Using Amazon EC2 to Thwart Crappy Internal IT Services</title>
		<link>http://www.cerebralmastication.com/2009/11/using-amazon-ec2-to-thwart-crappy-internal-it-services/</link>
		<comments>http://www.cerebralmastication.com/2009/11/using-amazon-ec2-to-thwart-crappy-internal-it-services/#comments</comments>
		<pubDate>Tue, 03 Nov 2009 15:28:26 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[rant]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=391</guid>
		<description><![CDATA[
The alternative title of this blog post is &#8220;How to get your sorry ass fired by violating your internal IT policies.&#8221; So keep that in mind as you read this.
I say lots of silly crap. Twitter allows me the pleasure of sharing this blather with the world. I was a little surprised that of all [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://twitter.com/CMastication/status/5294564298" onclick="pageTracker._trackPageview('/outgoing/twitter.com/CMastication/status/5294564298?referer=');"><img class="alignleft size-full wp-image-393" style="margin: 6px;" title="ec2 tweet" src="http://www.cerebralmastication.com/wp-content/uploads/2009/11/ec2-tweet.PNG" alt="ec2 tweet" width="417" height="233" /></a></p>
<p>The alternative title of this blog post is &#8220;How to get your sorry ass fired by violating your internal IT policies.&#8221; So keep that in mind as you read this.</p>
<p>I say lots of silly crap. Twitter allows me the pleasure of sharing this blather with the world. I was a little surprised that of all the things I have said over the last few months the above Tweet received the most discussion. Apparently this tweet captured the imagination and consternation of some fellow Tweeters. I had people follow up with me and basically ask, &#8220;what do you mean?&#8221; Twitter is good for a sound bite, but less so for an elaborate answer. Which brings us to this:</p>
<p>What are the top ways Amazon EC2 can allow a business user to escape the manipulative and counterproductive grip of corporate IT? Well I&#8217;m glad you asked!</p>
<p><strong>1) Over-restrictive web filtering policies</strong>:  When I worked as a risk manager for a Fortune 500 insurance firm I was shocked on the first day when I could not search Google Groups. At the time Google Groups was one of my favorite resources for figuring out everything from SQL syntax to Excel formulas. The firm, like most firms, outsourced the filtering of web content. Apparently they signed up for &#8220;Super Freaking Restrictive&#8221; filtering. I could not even search the web for &#8220;Ubuntu&#8221; as all sites with the word Ubuntu in the title or with the word &#8220;Ubuntu&#8221; passed as a form submission were blocked. Apparently Ubuntu is not just a Linux distro, but also a militant organization of African computer programmers, or something. So how did I get around this with EC2? I would fire up an EC2 Ubuntu instance running Squid proxy before I left home, then ssh into the cloud from work and use a little SSH port forwarding to route my web traffic through the ssh connection and out via Squid. I set up my EC2 instance to listen for ssh on port 443 and my firm&#8217;s firewall would let the connection pass as it assumed it was simply ssl traffic into Amazon. Brilliant!</p>
<p><strong>2) Under powered database servers: </strong>At another point I was responsible for data analytics on a portfolio of insurance policies. I had to join together data from multiple systems (underwriting, admin, claims, etc.). The firm was an Oracle shop and none of the Oracle machines had enough user space for me to make the big ass join that had to be made in order to cobble together my analytics. For a while I hobbled along using PROC SQL in SAS to bring all the data together inside of SAS running on a PC. Finally I just gave up and built my own data mart in the cloud. And I could totally cut my internal IT politics out of the system. Whew, once the politics and begging for resources was over I could kick ass at analytics without having to beg borrow and plead for permissions and space.</p>
<p><strong>3) Failure to backup desktop machines / inadequate shared drive space: </strong>Another experience I had was with a firm that decided it was a good policy to NOT back up desktop PCs at all. Each department was given shared drive space on a central server where &#8220;business critical&#8221; files were supposed to be kept (whatever the hell that means). Only the files on the central server were backed up. I was in the risk management department (ironically) and we had a whopping 100 MB allocated to us. Yes, this was 2004 and 100 MB was not enough to hold 2 years of risk reviews. Not to mention any ad hoc analysis and all the supporting documents. So everyone had their desktop drives, at least one USB drive, and no off site backup. It was during this period that I discovered <a href="http://www.jungledisk.com/" onclick="pageTracker._trackPageview('/outgoing/www.jungledisk.com/?referer=');">Jungle Disk </a>which allows client side encrypted data to be backed up to Amazon! Off site backup problem solved! And, once again, corp IT cut out of the system. (yes, this is a use of S3, not EC2) By the way, I paid for backups out of my own pocket because I felt it was very important. Well, I did have the firm buy me books which I happily kept when I left. We&#8217;ll call it even.</p>
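<p>For what it&#8217;s worth, the tunnel from item 1 boils down to a single ssh command. Here&#8217;s a sketch that just prints it. I&#8217;m assuming Squid is on its default port 3128 and that sshd on the instance has been moved to port 443; the function name and the user and host names are placeholders:</p>
<blockquote>
<pre># squid_tunnel_command USER HOST: print the ssh command for the tunnel.
# ASSUMPTIONS: sshd on the instance listens on 443, Squid on its default 3128.
squid_tunnel_command() {
    remote_user=$1
    remote_host=$2
    # -p 443: reach sshd on the port the firewall treats as SSL
    # -N: no remote command, the session only carries the forward
    # -L: local port 3128 is forwarded to Squid on the instance
    echo "ssh -p 443 -N -L 3128:localhost:3128 ${remote_user}@${remote_host}"
}

squid_tunnel_command ubuntu ec2-host.example.com</pre>
</blockquote>
<p>With that running, the browser&#8217;s HTTP proxy gets pointed at localhost port 3128 and everything goes out via Squid.</p>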
<p>Let me reiterate that all three of the above uses <span style="text-decoration: line-through;">may have</span> <span style="color: #000000;">put me in direct violation of my corporate IT policies. And let me also state that ultimately I found a job at a firm where internal IT sees their job as helping the business units get crap done. If you are an IT professional and you find yourself thinking, &#8220;damn, I have to make sure I restrict my users from all of these crafty uses of EC2&#8221; then, <strong><span style="color: #993300;">jackass, you are the problem with your firm&#8217;s IT department</span></strong>. If you see your job as stopping users then you are a useless burden on your firm and you should be not only fired, but spat upon. The way to prevent users from doing these and other &#8220;shadow IT&#8221; behaviors is to <strong><span style="color: #993300;">provide the IT services that help your users be awesome!</span></strong> If you do that then you don&#8217;t have to worry about what your users are up to. They&#8217;ll be too damn busy being awesome to have time to mess with Amazon EC2.</span></p>
<p>All the examples above took place at previous places of employment. I currently use Amazon EC2 in order to scale some of my analytics, but it is done with the knowledge and support of my internal IT team. They fully understand what I am doing and they want to help me be awesome at analysis. It&#8217;s amazing how much less time I am wasting these days now that I don&#8217;t have to be so creative about avoiding the manipulative and counterproductive intervention of my internal IT team.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2009/11/using-amazon-ec2-to-thwart-crappy-internal-it-services/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
	</channel>
</rss>

