<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cerebral Mastication &#187; workflow</title>
	<atom:link href="http://www.cerebralmastication.com/tag/workflow/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.cerebralmastication.com</link>
	<description>Something to Chew On</description>
	<lastBuildDate>Wed, 07 Dec 2011 13:08:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Details of two-way sync between two Ubuntu machines</title>
		<link>http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/</link>
		<comments>http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/#comments</comments>
		<pubDate>Mon, 18 Apr 2011 20:48:32 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[workflow]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=966</guid>
		<description><![CDATA[In a previous post I discussed my frustrations with trying to get Dropbox or Spideroak to perform BOTH encrypted remote backup AND fast two-way file syncing. This post details how I set up two machines, both running Ubuntu 10.10, to perform two-way sync, where a file change on either machine [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2011/04/SyncDifferent.png"><img class="alignleft size-full wp-image-956" title="sync" src="http://www.cerebralmastication.com/wp-content/uploads/2011/04/SyncDifferent.png" alt="" width="128" height="128" /></a>In a <a href="http://www.cerebralmastication.com/2011/04/fast-two-way-sync-in-ubuntu/">previous post</a> I discussed my frustrations with trying to get Dropbox or Spideroak to perform BOTH encrypted remote backup AND fast two-way file syncing. This post details how I set up two machines, both running Ubuntu 10.10, to perform two-way sync, where a file change on either machine is replicated on the other.</p>
<p>I initially tried running Unison on BOTH my laptop and the server, with the server&#8217;s Unison set to sync with my laptop back through an SSH reverse proxy. After testing this for a while I discovered this is totally the wrong way to do it. The problem is that the Unison process makes temp directories and files in the file system of the target. So my Unison job on the laptop would be trying to sync files and, in the process, create temp files which would kick off a Unison sync on the server which would make temp files on the laptop&#8230; I think you can see how convoluted this gets.</p>
<p>So a much better solution is to only run Unison from one machine (I chose my laptop) and have the other machine (server in my case) send an SSH command (over the aforementioned reverse proxy) to the laptop asking the laptop to kick off a Unison sync. This way all of the syncs happen from the laptop.</p>
<p>So, in short, both machines run lsyncd, which monitors files for changes. I keep up an SSH tunnel with reverse port forwarding which forwards a remote machine port back to my laptop&#8217;s port 22 (SSH). Unison need be installed ONLY on my laptop. When a change happens on my laptop, lsyncd fires off a Unison sync from my laptop that syncs it with the server. When a file changes on the server, the lsyncd job on the server makes a connection to my laptop via ssh and fires off a Unison sync between my laptop and the server.</p>
<p>Here&#8217;s an example of my lsyncd config scripts:</p>
<p><strong>Laptop:</strong></p>
<blockquote><p>settings = {<br />
logfile    = "/home/jal/lsyncd/laptop/lsyncd.log",<br />
statusFile = "/home/jal/lsyncd/laptop/lsyncd.status",<br />
maxDelays  = 15,<br />
--nodaemon   = true,<br />
}</p>
<p>runUnison2 = {<br />
maxProcesses = 1,<br />
delay = 15,<br />
onAttrib  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onCreate  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onDelete  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onModify  = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onMove    = "/usr/bin/unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
}</p>
<p>sync{runUnison2, source="/home/jal/Documents"}</p></blockquote>
<p><strong>Server:</strong></p>
<blockquote><p>settings = {<br />
logfile    = "/home/jal/lsyncd/server/lsyncd.log",<br />
statusFile = "/home/jal/lsyncd/server/lsyncd.status",<br />
maxDelays  = 15,<br />
--nodaemon   = true,<br />
}</p>
<p>runUnison2 = {<br />
maxProcesses = 1,<br />
delay = 15,<br />
onAttrib  = "ssh localhost -p 5432 unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onCreate  = "ssh localhost -p 5432 unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onDelete  = "ssh localhost -p 5432 unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onModify  = "ssh localhost -p 5432 unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
onMove    = "ssh localhost -p 5432 unison -batch /home/jal/Documents ssh://12.34.56.78//home/jal/Documents",<br />
}</p>
<p>sync{runUnison2, source="/home/jal/Documents"}</p></blockquote>
<p>Keep in mind that I am using version 2 of lsyncd which can be downloaded here: <a href="http://code.google.com/p/lsyncd/" onclick="pageTracker._trackPageview('/outgoing/code.google.com/p/lsyncd/?referer=');">http://code.google.com/p/lsyncd/</a></p>
<p>The version of lsyncd available in the Ubuntu repo is version 1.x which does not use the same config format as I illustrate above. However, if you run into dependency issues with v2, the easiest thing to do is install the repo version which will install dependencies and then manually download and install v2 from the above URL.</p>
<p>My reverse port forwarding set up looks like this:</p>
<blockquote><p>autossh -2 -4 -X -R 5432:localhost:22 12.34.56.78</p></blockquote>
<p>The -R bit forwards remote port 5432 to my laptop&#8217;s port 22, which is the SSH port. So if I run ssh localhost -p 5432 on my server, what actually happens is that I am sshing from the remote machine to my laptop.</p>
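<p>As a rough sketch (not my exact script), the tunnel command can be assembled from a few shell variables and printed before it is run. The variable and function names are made up for illustration; the address and ports are the ones used in this post.</p>
<blockquote><p># Hypothetical names; values taken from the example above.<br />
SERVER=12.34.56.78<br />
TUNNEL_PORT=5432<br />
LAPTOP_SSH_PORT=22<br />
# Print the autossh command instead of running it, so it can be checked first.<br />
tunnel_cmd() {<br />
printf 'autossh -2 -4 -X -R %s:localhost:%s %s\n' "$TUNNEL_PORT" "$LAPTOP_SSH_PORT" "$SERVER"<br />
}<br />
tunnel_cmd</p></blockquote>
<p>Running the printed command on the laptop opens the tunnel. A quick check from the server is then &#8220;ssh localhost -p 5432 hostname&#8221;, which should report the laptop&#8217;s hostname rather than the server&#8217;s.</p>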
<p><strong>Notes:</strong></p>
<ul>
<li>The IP address of my server in this example is 12.34.56.78.</li>
<li>Don&#8217;t try to sync the directories where the lsyncd logs are kept. That will result in an endless sync cycle as each machine keeps noticing the other&#8217;s changes. Don&#8217;t ask me how I know this.</li>
<li>The command to start the sync on the laptop is &#8220;lsyncd /home/jal/lsyncd/laptop/configfile&#8221; where configfile is the above lsyncd configuration file.</li>
<li>lsyncd could, conceivably, tell Unison to sync only the part of the directory tree that changed. I have not been able to make that feature work right, however. And it only takes Unison a few seconds to sync, so I&#8217;ve not worried about it.</li>
</ul>
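<p>On that last note: Unison has a -path preference that restricts a run to one subtree of the roots, and an -ignore preference for skipping paths. I have not wired either into lsyncd, so the following is only a sketch of the command line such a setup might generate. The function name, the example subdirectory, and the lsyncd path in the ignore pattern are hypothetical.</p>
<blockquote><p>LOCAL_ROOT=/home/jal/Documents<br />
REMOTE_ROOT=ssh://12.34.56.78//home/jal/Documents<br />
# Print a Unison command limited to one subdirectory, given relative to the roots.<br />
# The ignore pattern only matters if a directory named lsyncd sits inside the synced tree.<br />
unison_cmd() {<br />
printf "/usr/bin/unison -batch %s %s -path %s -ignore 'Path lsyncd'\n" "$LOCAL_ROOT" "$REMOTE_ROOT" "$1"<br />
}<br />
unison_cmd projects/report</p></blockquote>
<p>Printing the command rather than running it makes it easy to eyeball what a per-directory sync would look like before trusting it with real files.</p>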
<p>This has greatly sped up my <a href="http://rstudio.org" onclick="pageTracker._trackPageview('/outgoing/rstudio.org?referer=');">RStudio</a> based workflow when doing analysis with R. Now when I change files on my server using RStudio they are immediately (well it waits 15 seconds) replicated to my local machine and vice versa!</p>
<p>Good luck and if you have any suggestions please post a comment!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/feed/</wfw:commentRss>
		<slash:comments>30</slash:comments>
		</item>
		<item>
		<title>Fast Two Way Sync in Ubuntu!</title>
		<link>http://www.cerebralmastication.com/2011/04/fast-two-way-sync-in-ubuntu/</link>
		<comments>http://www.cerebralmastication.com/2011/04/fast-two-way-sync-in-ubuntu/#comments</comments>
		<pubDate>Sat, 09 Apr 2011 15:32:48 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[sync]]></category>
		<category><![CDATA[workflow]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=955</guid>
		<description><![CDATA[I love the portability of a laptop. I have a 45 min train ride twice a day and I fly a little too, so having my work with me on my laptop is very important. But I hate doing long running analytics on my laptop when I&#8217;m in the office because it bogs down my [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2011/04/SyncDifferent.png"><img class="alignleft size-full wp-image-956" title="sync" src="http://www.cerebralmastication.com/wp-content/uploads/2011/04/SyncDifferent.png" alt="" width="128" height="128" /></a>I love the portability of a laptop. I have a 45 min train ride twice a day and I fly a little too, so having my work with me on my laptop is very important. But I hate doing long running analytics on my laptop when I&#8217;m in the office because it bogs down my laptop and all those videos on <a href="http://www.thesuperficial.com/" onclick="pageTracker._trackPageview('/outgoing/www.thesuperficial.com/?referer=');">The Superficial</a> get all jerky and stuff.</p>
<p>I get around this conundrum by running much of my analytics on either my work server or on an EC2 machine (I&#8217;m going to call these collectively &#8220;my servers&#8221; for the rest of this post). The nagging problem with this has been keeping files in sync. <a href="http://rstudio.org/" onclick="pageTracker._trackPageview('/outgoing/rstudio.org/?referer=');">RStudio Server</a> has been a great help to my workflow because it lets me edit files in my browser and they run on my servers. But when a long running R job blows out files I want those IMMEDIATELY synced with my laptop. That way I know when I undock my laptop to run to the train station that all my files will be there for me to spill Old Style beer on as I ride the Metra North line.</p>
<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2011/04/dropbox_logo_home.png"><img class="alignleft size-full wp-image-958" style="margin: 5px;" title="dropbox_logo_home" src="http://www.cerebralmastication.com/wp-content/uploads/2011/04/dropbox_logo_home.png" alt="" width="209" height="54" /></a>I experimented with <a href="https://www.dropbox.com/" onclick="pageTracker._trackPageview('/outgoing/www.dropbox.com/?referer=');">Dropbox</a> and I gotta say, it&#8217;s great. It really is well engineered, fast, and drop dead simple. I love that with Dropbox I could pull up most any file from my Dropbox on my iPad or iPhone. That&#8217;s a very handy feature. And it&#8217;s fast. If I created a small text file on my server, it would be synced with my laptop in a few seconds. Perfect! Well&#8230; almost. Dropbox has a huge limitation: encryption. Dropbox encrypts for transmission and may even store files encrypted on their end. However, Dropbox controls the key. So if a rogue employee, a crafty Russian hacker, or a law enforcement officer with a subpoena gained access to Dropbox, they could get access to my files without my knowledge. As a risk manager I can&#8217;t help but see Dropbox&#8217;s security as a huge, targeted, single point of failure. It&#8217;s hard to say which would be a bigger payday: cracking GMail, or cracking Dropbox. But I&#8217;m suspicious it&#8217;s Dropbox. There are some workarounds to try and shoehorn file encryption into Dropbox, and they all suck.</p>
<p><a href="http://www.cerebralmastication.com/wp-content/uploads/2011/04/logo.gif"><img class="alignleft size-full wp-image-960" style="margin: 5px; border: 0pt none;" title="logo" src="http://www.cerebralmastication.com/wp-content/uploads/2011/04/logo.gif" alt="" width="85" height="80" /></a>So Dropbox can&#8217;t really give me what I want (what I really really want). But I stumbled into <a href="https://spideroak.com/" onclick="pageTracker._trackPageview('/outgoing/spideroak.com/?referer=');">Spideroak</a> who are like the smarter, but lesser known cousins of Dropbox. Their software does everything Dropbox does (including tracking all revisions!) but they have a &#8220;trust no one&#8221; model which encrypts all files before leaving my computer using, and this is critical, MY key which they don&#8217;t store. Pretty cool, eh? Spideroak also has an iPad/iPhone app and offers a neat feature that allows emailing any file in my Spideroak &#8220;bucket&#8221; to anyone using my iPhone without having to upload the file to my iPhone first. They do this by sending a special link to the email recipient that allows them to open only the file you wanted them to have. This could be a huge bacon saver on the road.</p>
<p>So Spideroak&#8217;s the panacea then? Well&#8230; um&#8230; no. They have two critical flaws: 1) They depend on time stamps on files to determine most recent file. 2) Syncs are slow, sometimes taking more than 5 minutes for very small files. The time stamp issue is an engineering failure, plain and simple. I&#8217;ve talked to their tech support and been assured that they are going to change this and index using server time, not system time in the future. But as of April 6, 2011, Spideroak uses local system time. For most users this is no big deal. For my use case this is painful. My server and my laptop were 6 seconds different and that time difference was enough for me to get Spideroak confused about which files were the freshest. This is a big deal when syncing two file systems with fast changing files. The other issue, slow sync, was actually more painful but probably the result of their attempt to be nice with CPU time and also encryption. When jobs on my server finished, I expected those files to start syncing within seconds and the only delay I expected was bandwidth constraints. With Spideroak, syncs might take 5 minutes to start and then it would go out for coffee, come back jittery and then finally complete. Even if Spideroak fixed the time sync issue (or I forced my laptop to set its time based on my server), it still would not work for my sync because of the huge lags.</p>
<p>So looking at Dropbox and Spideroak I realized that I liked everything about Spideroak except its sync. It&#8217;s a great cloud backup tool that seems to properly do encryption, it&#8217;s multiplatform (win, linux, mac), has an iPad/iPhone app for viewing/sending files, it&#8217;s smart about backups and won&#8217;t upload the same file twice (even if the file is on two different computers). For my business use, I just can&#8217;t use Dropbox. The lack of &#8220;trust no one&#8221; encryption is a deal killer. So what I really need is a sync solution to use alongside Spideroak.</p>
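<p>To see why a few seconds of clock skew matters, here is a toy illustration of a &#8220;latest timestamp wins&#8221; rule. This is not Spideroak&#8217;s actual algorithm, which I have not seen; the function name, the event times, and the direction of the skew are all assumptions.</p>
<blockquote><p># pick_newest LAPTOP_STAMP SERVER_STAMP: naive rule that trusts each machine&#8217;s own clock.<br />
pick_newest() {<br />
if [ "$1" -gt "$2" ]; then echo laptop; else echo server; fi<br />
}<br />
SKEW=6 # assumed: the server clock runs 6 seconds ahead of the laptop<br />
server_true=100 # server writes the file at t=100<br />
laptop_true=104 # laptop edits its copy 4 seconds later<br />
server_stamp=$((server_true + SKEW)) # recorded as 106<br />
laptop_stamp=$laptop_true # recorded as 104<br />
pick_newest "$laptop_stamp" "$server_stamp"</p></blockquote>
<p>The laptop edit really happened 4 seconds after the server write, but the server&#8217;s clock stamps its file 2 seconds later than the laptop&#8217;s, so the rule prints &#8220;server&#8221; and the older file wins.</p>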
<p>There are some neat projects out there for sync. Projects like <a href="http://www.sparkleshare.org/" onclick="pageTracker._trackPageview('/outgoing/www.sparkleshare.org/?referer=');">Sparkleshare</a> look really promising but they are trying to do all sorts of things, not just sync. I&#8217;ve already settled on letting Spideroak do backup and version tracking so I don&#8217;t really need all those features&#8230; OK, OK, I can hear you muttering, &#8220;just use rsync and be done with it already.&#8221; Yeah, that&#8217;s a good idea. But rsync is single directional and does a lot of things well, but can also be a bit of an asshole if you don&#8217;t set all the flags right and rub its belly the right way. If you google for &#8220;bidirectional sync&#8221; you&#8217;re going to see this problem has plagued a lot of folks. This blog post has already gone on long enough so I&#8217;ll cut to the chase. Here&#8217;s the stack of tools I settled on for cobbling together my own secure, real-time, bidirectional sync between two Ubuntu boxes (one of which changes IP address and is often behind a NAT router):</p>
<p>1) <a href="http://www.cis.upenn.edu/~bcpierce/unison/" onclick="pageTracker._trackPageview('/outgoing/www.cis.upenn.edu/_bcpierce/unison/?referer=');">Unison</a> &#8211; Fast sync using rsync-esque algos and really fast caching/scanning</p>
<p>2) <a href="http://code.google.com/p/lsyncd/" onclick="pageTracker._trackPageview('/outgoing/code.google.com/p/lsyncd/?referer=');">lsyncd</a> &#8211; Live (real-time) sync daemon</p>
<p>3) <a href="http://linux.die.net/man/1/autossh" onclick="pageTracker._trackPageview('/outgoing/linux.die.net/man/1/autossh?referer=');">autossh</a> &#8211; ssh client with a nifty wrapper that keeps the connection alive and respawns the connection if dropped</p>
<p>I&#8217;ll do another post with the nitty-gritty of how I set this up, but the short version is that I installed Unison and lsyncd on both the laptop and the server. Single direction sync from my laptop to the server is pretty straightforward: lsyncd watches files, and if one changes it calls unison, which syncs the files with the server. The tricky bit was getting my server to be able to sync with my laptop which is often behind a NAT router. The solution was to open an ssh connection from my laptop to my server using autossh and reverse port forward port 5555 from the server back to my laptop&#8217;s port 22. That way an lsyncd process on the server can monitor the file system and when it sees a change can kick off a unison job that syncs the server to ssh://localhost:5555//some/path which is forwarded to my laptop! Autossh makes sure that connection does not get dropped and respawns if it does get dropped. So with a little shell scripting to start the lsyncd daemon on both machines, some config of lsyncd, and a local shell script to fire off the autossh connection, I&#8217;ve got real-time bidirectional sync!</p>
<p>In a follow up post I&#8217;ll put up the details of this configuration. Stay tuned. (EDIT: <a href="http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/">Update posted</a>!)</p>
<p>If you&#8217;ve solved sync a different way and you like your solution, please comment. I&#8217;ve not settled that this is my long-term solution. It&#8217;s just a solution that works. Which is more than I had yesterday.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2011/04/fast-two-way-sync-in-ubuntu/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>The Downside of &quot;Flow&quot;&#8230; Technician&#039;s Myopia</title>
		<link>http://www.cerebralmastication.com/2009/03/the-downside-of-flow-technicians-myopia/</link>
		<comments>http://www.cerebralmastication.com/2009/03/the-downside-of-flow-technicians-myopia/#comments</comments>
		<pubDate>Tue, 10 Mar 2009 15:41:01 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[flow]]></category>
		<category><![CDATA[workflow]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=224</guid>
		<description><![CDATA[In geeky circles there&#8217;s often talk of finding your &#8220;flow.&#8221; The term was coined by psychologist Mihaly Csikszentmihalyi in his book Flow: The Psychology of Optimal Experience, a topic he revisits in the smaller and more readable Finding Flow: The Psychology of Engagement with Everyday Life. The general idea is finding the state where you [...]]]></description>
			<content:encoded><![CDATA[<p>In geeky circles there&#8217;s often talk of finding your &#8220;flow.&#8221; The term was coined by psychologist Mihaly Csikszentmihalyi in his book <a href="http://www.amazon.com/gp/product/0061339202?ie=UTF8&amp;tag=riskthou-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0061339202" onclick="pageTracker._trackPageview('/outgoing/www.amazon.com/gp/product/0061339202?ie=UTF8_amp_tag=riskthou-20_amp_linkCode=as2_amp_camp=1789_amp_creative=390957_amp_creativeASIN=0061339202&amp;referer=');">Flow</a><a id="static_txt_preview" href="http://www.amazon.com/gp/product/0061339202?ie=UTF8&amp;tag=riskthou-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0061339202" onclick="pageTracker._trackPageview('/outgoing/www.amazon.com/gp/product/0061339202?ie=UTF8_amp_tag=riskthou-20_amp_linkCode=as2_amp_camp=1789_amp_creative=390957_amp_creativeASIN=0061339202&amp;referer=');">: The Psychology of Optimal Experience</a>, a topic he revisits in the smaller and more readable <a id="static_txt_preview" href="http://www.amazon.com/gp/product/0465024114?ie=UTF8&amp;tag=riskthou-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0465024114" onclick="pageTracker._trackPageview('/outgoing/www.amazon.com/gp/product/0465024114?ie=UTF8_amp_tag=riskthou-20_amp_linkCode=as2_amp_camp=1789_amp_creative=390957_amp_creativeASIN=0465024114&amp;referer=');">Finding Flow: The Psychology of Engagement with Everyday Life</a>. The general idea is finding the state where you lose yourself in what you are doing and find the experience fun, enjoyable, and productive. There&#8217;s even a <a href="http://en.wikipedia.org/wiki/Flow_(psychology)" onclick="pageTracker._trackPageview('/outgoing/en.wikipedia.org/wiki/Flow_psychology?referer=');">Wikipedia article on the topic of flow</a>. I think that the idea of &#8216;finding flow&#8217; really resonates with knowledge workers for a couple of reasons. 
First, knowledge workers love a meaty task that they can &#8216;lose themselves&#8217; in. It&#8217;s one of the few perks of knowledge work. The other reason this resonates so well with knowledge workers and male geeks in particular is video games. Video games are the ultimate flow finders. The Wikipedia article above lists 9 things that indicate a state of flow:</p>
<blockquote><p>1. <em>Clear goals</em> (expectations and rules are discernible and goals are attainable and align appropriately with one&#8217;s skill set and abilities).</p>
<p>2. <em>Concentrating and focusing</em>, a high degree of concentration on a limited field of attention (a person engaged in the activity will have the opportunity to focus and to delve deeply into it).</p>
<p>3. A <em>loss of the feeling of <a title="Self-consciousness" href="http://en.wikipedia.org/wiki/Self-consciousness" onclick="pageTracker._trackPageview('/outgoing/en.wikipedia.org/wiki/Self-consciousness?referer=');">self-consciousness</a></em>, the merging of action and awareness.</p>
<p>4. <em>Distorted sense of time</em>, one&#8217;s subjective experience of time is altered.</p>
<p>5. Direct and immediate <em>feedback</em> (successes and failures in the course of the activity are apparent, so that behavior can be adjusted as needed).</p>
<p>6. <em>Balance between ability level and challenge</em> (the activity is neither too easy nor too difficult).</p>
<p>7. A sense of personal <em>control</em> over the situation or activity.</p>
<p>8. The activity is <em>intrinsically rewarding</em>, so there is an effortlessness of action.</p>
<p>9. People become absorbed in their activity, and focus of awareness is narrowed down to the activity itself, <em>action awareness merging</em>.</p></blockquote>
<p>Wow, that&#8217;s just like me playing <a href="http://en.wikipedia.org/wiki/Call_of_Duty_4" onclick="pageTracker._trackPageview('/outgoing/en.wikipedia.org/wiki/Call_of_Duty_4?referer=');">Call of Duty 4! </a>I recall being an undergraduate and playing <a href="http://en.wikipedia.org/wiki/Wolfenstein_3D" onclick="pageTracker._trackPageview('/outgoing/en.wikipedia.org/wiki/Wolfenstein_3D?referer=');">Castle Wolfenstein 3D </a>for so long that when I laid down to sleep I felt motion sick because I could still see the game in my head. I would get my flow on and not be able to stop playing. It was really addictive.</p>
<p>That addictive nature of flow is really the downside of flow. I had a &#8216;downside of flow&#8217; experience in my first job after grad school. I was a consultant in a small firm and we did analytical modeling. One challenge we had was to fit distributions to unknown data. We didn&#8217;t have the underlying data but we did know a couple points on the distribution and we could infer some things about the general behavior of the distribution. This is a fairly odd thing to do, actually. So there were no canned SAS routines to call. Code had to be written. After I was sure I understood the problem and worked with one of the senior principals of the firm to test some ideas I ran off to bang out code. I got totally absorbed in this project. I laid in bed at night and visualized optimization routines and transposed matrices. I would get up in the morning excited to go to work and stay late to tweak my code. After about a week the principal of the firm asked me how it was going and I showed him my work. He nodded and said, &#8220;Well good. What else have you done this week?&#8221; All I could think was, &#8220;WTF do you mean by &#8216;what else?&#8217;&#8221; I stuttered a bit and said that this had taken all week. He was noticeably chagrined. He rubbed his forehead and said, &#8220;this is not an all week project. You should never have thrown all week at this, we have other things to do.&#8221; It felt like he kicked me in the balls. The very thing I was enjoying and proud of he thought was a waste of time. Ouch.</p>
<p>That&#8217;s been a few years and I have had the pleasure of supervising others who are learning about both sides of flow. I&#8217;ve also lost days working on things that in retrospect I should not have sunk so much time into. I&#8217;ve taken to calling this type of flow &#8220;<strong>technician&#8217;s myopia</strong>&#8221;. As a technical person it is soooo easy to get enthralled with the technical challenge at hand and totally lose context. Yet that is what flow is all about: getting totally sucked in. I&#8217;ve read stories of an academic statistician in the 60&#8217;s who got his first computer and completely stopped doing academic research because he got sucked into learning all he could about the computer.</p>
<p>Technician&#8217;s myopia is a real productivity killer for those of us in technical fields. What makes it so hard is the lack of absolutes. If I am having a bit of a slow time and I want to spend two days researching systems for distributed regression using Amazon EC2, that might be a good use of time as long as I limit myself to two days. If, on the other hand, I am working against a deadline and I take a tangent and lose myself in that tangent, I can waste days at the worst possible time. How can we techie types keep from having technician&#8217;s myopia? Here are my recommendations:</p>
<ol>
<li><strong>Self Awareness </strong>- If you want to have flow all the time and work on the wrong things while in that flow, have at it. Don&#8217;t read the other points below. Keep your resume polished, however, as you will need it. However if you want to flow when it&#8217;s right yet keep that flow focused, read on.</li>
<li><strong>Management </strong>- If you have a technical manager he/she is most likely well aware of technician&#8217;s myopia, although the term may be new. Talk to them about it. If you are doing weekly project reviews then you can&#8217;t have myopia for more than a week. If you struggle with this a lot, ask for 15 minutes every morning to ensure that you are focused properly.</li>
<li><strong>Buddy Dive </strong>- You may not have the type of manager who can understand technician&#8217;s myopia. In that case take a tip from the Navy Seals and buddy dive. Get one of your technical coworkers to sit down with you once or twice a week to discuss your projects and how they are going. Specifically ask each other if you are struggling with spending time on the wrong parts. Remember, it&#8217;s ok to flow, just don&#8217;t get lost in it to the detriment of what really matters.</li>
<li><strong>Set Clear Deadlines </strong>- Having clear deadlines and even intermediate deadlines is just good self management. The neat thing about it is having clear deadlines and focused work attention actually helps most folks flow and prevents technician&#8217;s myopia. Double bonus!</li>
<li><strong>Free Flow Time </strong>- Set aside time every week to work on projects which you are drawn to but which are off focus from your main work. This may be personal time or work time depending on your situation. Outside of work I have to have periodic &#8216;garage therapy&#8217; where I disappear into the garage to get my flow on while working on my project car. My wife knows and understands this. If you desire flow but don&#8217;t get enough of it, you will feel restless and cranky. So make sure you have an outlet. But put a time frame on it. You will hate your job even more if you stay up all night killing headcrabs.</li>
</ol>
<p><object width="425" height="344" data="http://www.youtube.com/v/oPofh1UiMKA&amp;hl=en&amp;fs=1" type="application/x-shockwave-flash"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/oPofh1UiMKA&amp;hl=en&amp;fs=1" /><param name="allowfullscreen" value="true" /></object></p>
<p>PS: as of  March 10, 2009 Google returns zero hits for &#8220;technician&#8217;s myopia&#8221; if you put it in quotes. Hmm&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2009/03/the-downside-of-flow-technicians-myopia/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Analysis Workflow&#8230; Part 1 of Infinity</title>
		<link>http://www.cerebralmastication.com/2009/02/data-analysis-workflow-part-1-of-infinity/</link>
		<comments>http://www.cerebralmastication.com/2009/02/data-analysis-workflow-part-1-of-infinity/#comments</comments>
		<pubDate>Thu, 26 Feb 2009 14:41:04 +0000</pubDate>
		<dc:creator>JD Long</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[workflow]]></category>

		<guid isPermaLink="false">http://www.cerebralmastication.com/?p=179</guid>
		<description><![CDATA[One of the many things that I sit around pondering when I should be doing productive things is the idea of analytical workflow. I have only worked with one analytical guru who I felt really gave thought and structure to workflow and its impact on analyst productivity. When I talk about workflow I mean the [...]]]></description>
			<content:encoded><![CDATA[<p>One of the many things that I sit around pondering when I should be doing productive things is the idea of analytical workflow. I have only worked with one analytical guru who I felt really gave thought and structure to workflow and its impact on analyst productivity. When I talk about workflow I mean the whole process from the time the analytical guy thinks, &#8220;Hey, I need to understand the velocity of new purchases between different types of sales campaigns&#8221; until he writes down his findings in a presentation or even just a notebook. In the middle I assume this guy extracts some data from a warehouse or live system, does some work on said data, tests some theories, does more stuff, goes and gets coffee, comes back and plays some flash games, goes home and does it again the next day.</p>
<p>Today I was reading over at Data Evolution about <a href="http://dataspora.com/blog/predictive-analytics-using-r/" onclick="pageTracker._trackPageview('/outgoing/dataspora.com/blog/predictive-analytics-using-r/?referer=');">a presentation on how Google and Facebook use R</a>. The following was a summary of what Bo Cowgill of Google said about his workflow:</p>
<blockquote><p>The typical workflow that Bo thus described for using R was: (i) pulling data with some external tool, (ii) loading it into R, (iii) performing analysis and modeling within R, (iv) implementing a resulting model in Python or C++ for a production environment.</p></blockquote>
<p>I found this interesting as I have been masticating on the idea of learning Python for some time. I have run into situations where R was slow, but generally I have solved those through rethinking my algorithm. I&#8217;m not really a good programmer in R (or any other language for that matter), but I do want/need/like the statistical functions and ease of plotting in R. If I do learn Python I&#8217;ll certainly use it to call R&#8230; but maybe I should just stick to R.</p>
<p>This has nothing to do with workflow, but the most thought provoking insights in the article above came from Itamar Rosenn at Facebook:</p>
<blockquote><p>Itamar’s team used recursive partitioning (via the <a href="http://cran.r-project.org/web/packages/rpart" onclick="pageTracker._trackPageview('/outgoing/cran.r-project.org/web/packages/rpart?referer=');">rpart</a> package) to infer that just two data points are significantly predictive of whether a user remains on Facebook: (i) having more than one session as a new user, and (ii) entering basic profile information.</p>
<p>&#8230; [they also] found that activity at three months was predicted by variables related to three classes of behavior: (i) how often a user was reached out to by others, (ii) frequency of third party application use, and (iii) what Itamar termed “receptiveness” — related to how forthcoming a user was on the site.</p></blockquote>
<p>So Facebook really wants new users to put more info into FB, use it more, and play with third party apps. I guess that logic is why LinkedIn is always telling me I am only 90% complete on my profile and I would be 95% if I would just, yada yada yada&#8230; The more info I put into their walled garden, the more I will play there. And the more ads I will see. Makes sense to me. I guess I follow the same model when I try to get my clients to use my services more and more&#8230; I want to be sticky too. But not in a bad way.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cerebralmastication.com/2009/02/data-analysis-workflow-part-1-of-infinity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

