[data-carpentry-discuss] Organizing Data Carpentry workshop - advice

Ted Hart edmund.m.hart at gmail.com
Thu Jan 29 08:55:19 PST 2015


Hi Leah,

Glad to see that NEON is supporting you in putting on the workshop. Having read
your e-mails, it sounds like what you want is partially in the domain of
Software Carpentry.  I think SWC would give them many things they might
need but not know they need.  For instance, you heard that people are
interested in loops, automation, and optimization.  Well, those are all
pretty hard to do without a decent understanding of variables and data
structures, and even optimization (profiling, to start) needs functions.

To expand a bit, how can you do loops and automation in any language
without understanding the basic data structures? For example, looping over a
list is different from looping over a vector, which is different from
looping over a data frame in R.  So I wonder if you could just sort of slip
those concepts in while focusing on the parts people want to learn, and
they'll learn more without even knowing it :). Git is also probably
essential because I know more and more NEON work is being put on GitHub.
In the end, the workshop might end up looking like a hybrid of an advanced
Data Carpentry (dealing with complex formats, SQL) and some basic Software
Carpentry.
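
To make the data-structure point concrete, here is a minimal sketch in
Python (chosen only for illustration; the thread's example is about R). It
assumes a table stored as a dict of equal-length lists as a rough stand-in
for a data frame, and every name and value in it is hypothetical sample
data.

```python
# All names and values below are hypothetical sample data, not NEON data.
sites = ["site_a", "site_b", "site_c"]  # a plain list

# A column-oriented table (dict of equal-length lists), used here as a
# rough stand-in for a data frame.
table = {
    "site": ["site_a", "site_b", "site_c"],
    "temp_c": [11.5, 9.0, 14.2],
}


def loop_over_list(values):
    """Looping over a list visits one element at a time."""
    return [value for value in values]


def loop_over_columns(tbl):
    """Looping over the table directly visits column names, not rows."""
    return [name for name in tbl]


def loop_over_rows(tbl):
    """Visiting rows takes an explicit step: zip the columns together."""
    names = list(tbl)
    return [dict(zip(names, row)) for row in zip(*tbl.values())]


if __name__ == "__main__":
    print(loop_over_list(sites))
    print(loop_over_columns(table))
    for row in loop_over_rows(table):
        print(row["site"], row["temp_c"])
```

The same surprise shows up in R, where `for (x in df)` walks the columns of
a data frame rather than its rows, so the loop you write depends on knowing
which structure you are holding.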

I don't know if this is heresy, but it might be best to pull from both
curriculums to meet the specific needs of NEON.

Ted



On Thu Jan 29 2015 at 8:03:18 AM Leah Wasser <lwasser at neoninc.org> wrote:

>  Hi Ethan, Tracy and Kai,
>
>
>
> Thank you ALL so much for the feedback so far. I truly appreciate it.
>
>
>
> I gave folks here until next week to fill out the survey. As of now, I'm
> at 18 responses, most of which are open to / interested in a full 2-day
> workshop: 6 for Python, 12 for R. Most are interested in SQL as well.
>
>
>
> Pulling Git makes sense and most would benefit from a short section on
> shell. I think most really want to focus on R or Python. I am also open to
> doing two workshops or splitting materials if that makes sense to cover
> everyone's interests. For instance, I thought about just doing an afternoon
> focused on manipulating HDF5 data in R.
>
>
>
> In this case, our audience is not novice users. They self-identify as
> beginner / intermediate, i.e. they're programming now, many regularly. Many
> don't have formal training and want to hone skills. Does that by default
> make DC not quite the right fit here? Thoughts?
>
>
>
> Topically they are interested in:
>
> 1.       Hierarchical data formats (I've built materials for that)
>
> 2.       Spatial data (I have a bit on that as well)
>
> 3.       Automating processes, looping, optimization
>
> 4.       I'm sure data viz as well (I didn't include that in the list)
>
>
>
> Very few were interested in the core programming skills (i.e. creating
> functions, variables, etc.). I think that's because they are already
> implementing those skills.
>
>
>
> Thank you again for any feedback / advice, etc.
>
> Leah
>
>
>
> *From:* Tracy Teal [mailto:tkteal at datacarpentry.org]
> *Sent:* Wednesday, January 28, 2015 8:40 PM
> *To:* Hsi-Kai (Kai) Yang
> *Cc:* Leah Wasser; dc-discuss at lists.idyll.org
> *Subject:* Re: [data-carpentry-discuss] Organizing Data Carpentry
> workshop - advice
>
>
>
> Hi Leah,
>
> That's great there's good interest! I agree with Ethan both that the
> lessons for Python and R are redundant and that it would be tough to teach
> both languages in two days. Maybe the question is more, what are people
> looking to do? Are they doing statistical analysis and data visualization
> with ecological data? If so, then R might be better. If they want to write
> scripts for data parsing or work with a lot of colleagues who are working
> in Python, then Python might be better.
>
> As Kai says, the current Data Carpentry workshop components are focused on
> data organization/management and the data
> analysis/presentation/visualization.
>
> The current modules are:
>
> - spreadsheets for data organization
> - OpenRefine for data cleaning (30 minute demo)
> - SQL for managing and querying data
> - introduction to R or Python for data analysis and visualization
> - the shell for automation
>
> If you wanted to be able to spend more time on R or Python, which we're
> finding people are interested in, especially being able to get through more
> of the data visualization, you could leave out the shell or SQL and use
> that extra time for R or Python.
>
> The mix of experience is always a challenge. Data Carpentry lessons right
> now have been developed for people with little to no prior computational
> experience, so no prior experience is expected or required. This means
> things can be a little slower for people who do have some experience, but
> has the advantage that we're clear about the level up front and it doesn't
> leave as many people behind. People with more experience still learn new
> tips and tricks for the things they've seen already and can help their
> neighbors, and often even if someone is experienced with one tool, they
> might be new to another - so maybe they know SQL well, but haven't worked
> in R before.
>
> This focus on learners newer to computation does mean, as Ethan mentioned,
> that we're also not currently teaching Git. As a concept, it's more
> advanced than what most people new to programming are ready for.  We have
> talked about adding to the R lesson a component about working with GitHub
> from within RStudio, as that takes away some of the complexity, but haven't
> had a chance to develop that or try it out yet.
>
> Do this approach and these modules seem to match what the people
> there need?
>
> Best,
> -Tracy
>
>     *Hsi-Kai (Kai) Yang* <hky2 at uw.edu>
>
> January 28, 2015 at 5:25 PM
>
> Leah:
>
> You might want to focus on either (1) data preparation/munging/management,
> or (2) data analysis/presentation. Data exploration probably is in between
> the two areas.
>
> It could be too aggressive trying to cover all aspects of data science in
> two days.
>
> Also you might want to assume the attendees can master at least one
> programming language. I believe learning how to program belongs to software
> carpentry. Data carpentry is all about data science.
>
> My 2 cents.
>
> Thanks.
>
> -kai
>
>
>
> _______________________________________________
> dc-discuss mailing list
> dc-discuss at lists.idyll.org
> http://lists.idyll.org/listinfo/dc-discuss
>
>   *Leah Wasser* <lwasser at neoninc.org>
>
> January 28, 2015 at 12:09 AM
>
> Hi Tracy, and fellow DC participants.
> I am looking for some advice. There has been ongoing interest in a Data
> Carpentry workshop at NEON (where I work, for those of you who don't know
> me). :)
>
> The challenge that I see at this point is figuring out what content would
> be most relevant. I posted a survey today and already have a handful of
> responses, all interested in a 2-day workshop.
>
> However, there is a mix of interest in Python vs R, and some mix of
> background (mostly intermediate-focused, however).
>
> I am giving everyone a week to respond to the survey. Then I need to
> figure out an approach. Depending upon the volume of responses, I am even
> thinking about something that is split across days (R one day, Python
> another). Git and shell combined? Or SQL? Can anyone help guide me through
> the logistics of deciding the best approach for this workshop once the
> survey results are in?
>
> Thank you in advance!!
> leah
>
> Leah A. Wasser, Ph.D.
> Remote Sensing Ecologist
> Senior Science Educator - Universities
> National Ecological Observatory Network (NEON)
> Boulder, Colorado
>
> ________________________________________
> From: dc-discuss-bounces at lists.idyll.org
> on behalf of dc-discuss-request at lists.idyll.org
> Sent: Tuesday, January 27, 2015 1:00 PM
> To: dc-discuss at lists.idyll.org
> Subject: dc-discuss Digest, Vol 3, Issue 2
>
> Today's Topics:
>
> 1. Data Carpentry Genomics and Assessment hackathon (Tracy Teal)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 27 Jan 2015 10:34:30 -0500
> From: Tracy Teal <tkteal at datacarpentry.org>
> Subject: [data-carpentry-discuss] Data Carpentry Genomics and
> Assessment hackathon
> To: discuss at datacarpentry.org
> Message-ID: <54C7B006.4060605 at datacarpentry.org>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> If you're working or interested in genomics or assessment, we hope
> you'll consider applying for our upcoming Data Carpentry Genomics and
> Assessment hackathon. We're very excited about this event and the
> opportunity to develop lessons targeting genomics researchers and build
> assessment into the curriculum. Travel support is available. Please
> apply to participate!
>
> Dates: March 23-25, 2015
> Location: Cold Spring Harbor Labs, NY
>
> It's a short application and the deadline is this Friday, January 30th.
>
> Call for Participation:
>
> https://docs.google.com/document/d/1r5Bfc-Igt7Hd8kjXsuPw7SenOHkxIQbEDbtnZfAxXbA/pub
>
> Application:
>
> https://docs.google.com/forms/d/17cSQyPIvTIhCQGrFLRoQ0kSway1ZyxgRm9QL85BW8v8/viewform
>
> If you have any questions about the event, please let me know!
>
> Best,
> -Tracy
>
>
>