[data-carpentry-discuss] Organizing Data Carpentry workshop - advice

Christie Bahlai cbahlai at msu.edu
Thu Jan 29 09:15:50 PST 2015


Hi Leah, Ted, et al;
 
If it's heresy, oops, I'm a heretic ;). I recently instructed at a SC
bootcamp at U of M, and we had learners divided into two rooms: novice and
intermediate. Looking at the pre-workshop surveys, I decided that the novice
room needed a bit more of an introductory go at object manipulation than the
standard SC materials provided, so I used the DC materials for the first
half of the R lesson. My co-instructors also worked in some of the DC data
management curriculum in with the shell lesson.

Is this slight hybridization kosher? All I know is the students are
apparently asking for a DC bootcamp as well!
 
Cheers,
-Christie
 
From: dc-discuss-bounces at lists.idyll.org
[mailto:dc-discuss-bounces at lists.idyll.org] On Behalf Of Ted Hart
Sent: Thursday, January 29, 2015 11:55 AM
To: Leah Wasser; Tracy Teal; Hsi-Kai (Kai) Yang
Cc: dc-discuss at lists.idyll.org
Subject: Re: [data-carpentry-discuss] Organizing Data Carpentry workshop -
advice
 
Hi Leah,

Glad to see that NEON is supporting you putting on the workshop. Having read
your e-mails it sounds like what you want is partially in the domain of
Software Carpentry.  I think SWC would give them many things they might
need, but not know they need.  For instance you heard that people are
interested in loops, automating and optimization.  Well, those are all
pretty hard to do without a decent understanding of variables and data
structures, and even optimization (profiling to start) needs functions.  
 
To expand a bit, how can you do loops and automation in any language without
understanding different basic data structures? E.g., looping over a list is
different than looping over a vector which is different than looping over a
dataframe in R.  So I wonder if you could just sort of slip those concepts
in while focusing on the parts people want to learn and they'll learn more
without even knowing it :). Git is also probably essential because I know
more and more NEON work is being put on GitHub. In the end the workshop might
end up looking like a hybrid of an advanced data carpentry (dealing with
complex formats, SQL) and some basic software carpentry.   
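
To make the data-structure point concrete, here is a minimal sketch in
Python (the other language under discussion in this thread; the remark above
is about R). The variable names and values are made up for illustration, and
the "table" is just a dict of columns standing in for a data frame. It shows
that the same loop syntax yields different things depending on the container.

```python
# Illustrative only: the container type decides what each loop step yields.
site_counts = [12, 7, 30]                    # a plain list of values
site_record = {"site": "A", "count": 12}     # a dict holding one record
site_table = {                               # a column-oriented table
    "site": ["A", "B", "C"],
    "count": [12, 7, 30],
}

# Looping over a list yields each value in turn.
list_items = [value for value in site_counts]

# Looping over a dict yields its keys, not its values.
dict_items = [key for key in site_record]

# Looping over the table yields column names, not rows.
table_columns = [name for name in site_table]

# Getting rows takes an explicit step: pair up the columns.
table_rows = list(zip(*site_table.values()))
```

A learner who expects the table loop to hand back rows will be surprised,
which is the kind of thing that is hard to debug without knowing the
underlying structure.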
 
I don't know if this is heresy, but it might be best to pull from both
curriculums to meet the specific needs of NEON.  
 
Ted
 
 
 
On Thu Jan 29 2015 at 8:03:18 AM Leah Wasser <lwasser at neoninc.org> wrote:
Hi Ethan, Tracy and Kai,
 
Thank you ALL so much for the feedback so far. I truly appreciate it.
 
I gave folks here until next week to fill out the survey. As of now, I'm at
18 responses - most of which are open / interested in a full 2 day workshop.
6-Python, 12 - R. Most are interested in SQL as well. 
 
Pulling Git makes sense and most would benefit from a short section on
shell. I think most really want to focus on R or python. I also am open to
doing two workshops or splitting materials if that makes sense to cover
everyone's interests. For instance, I thought about just doing an afternoon
focused on manipulating HDF5 data in R. 
 
In this case, our audience is not novice users. They self-identify as
beginner / intermediate, i.e. they're programming now, many regularly. Many
don't have formal training and want to hone skills. Does that by default
make DC not quite the right fit here? Thoughts?
 
Topically they are interested in:
1.       Hierarchical data formats (I've built materials for that)
2.       Spatial data (I have a bit on that as well)
3.       Automating processes, looping, optimization
4.       I'm sure data viz as well (I didn't include that in the list)
 
Very few were interested in the core programming skills (i.e. creating
functions, variables, etc). I think because they are already implementing
those skills. 
 
Thank you again for any feedback / advice, etc.
Leah 
 
From: Tracy Teal [mailto:tkteal at datacarpentry.org] 
Sent: Wednesday, January 28, 2015 8:40 PM
To: Hsi-Kai (Kai) Yang
Cc: Leah Wasser; dc-discuss at lists.idyll.org
Subject: Re: [data-carpentry-discuss] Organizing Data Carpentry workshop -
advice
 
Hi Leah,

That's great there's good interest! I agree with Ethan both that the lessons
for Python and R are redundant and that it would be tough to teach both
languages in two days. Maybe the question is more, what are people looking
to do? Are they doing statistical analysis and data visualization with
ecological data? If so, then R might be better. If they want to write
scripts for data parsing or work with a lot of colleagues who are working in
Python, that might be better. 

As Kai says, the current Data Carpentry workshop components are focused on
data organization/management and the data
analysis/presentation/visualization.

The current modules are:

- spreadsheets for data organization
- OpenRefine for data cleaning (30 minute demo)
- SQL for managing and querying data
- introduction to R or Python for data analysis and visualization
- the shell for automation

If you wanted to be able to spend more time on R or Python, which we're
finding people are interested in, especially being able to get through more
of the data visualization, you could leave out the shell or SQL and use that
extra time for R or Python. 

The mix of experience is always a challenge. Data Carpentry lessons right
now have been developed for people with little to no prior computational
experience, so no prior experience is expected or required. This means
things can be a little slower for people who do have some experience, but
has the advantage that we're clear about the level up front and it doesn't
leave as many people behind. People with more experience still learn new
tips and tricks for the things they've seen already and can help their
neighbors, and often even if someone is experienced with one tool, they might
be new to another - so maybe they know SQL well, but haven't worked in R
before. 

This focus on learners newer to computation does mean, as Ethan mentioned,
that we're also not currently teaching git. As a concept, it's more advanced
than what most people new to programming are ready for.  We have talked
about adding to the R lesson a component about working with github from
within RStudio, as that takes away some of the complexity, but haven't had a
chance to develop that or try it out yet. 

Do this approach and these modules seem to match what the people
there need? 

Best,
-Tracy

Hsi-Kai (Kai) Yang <hky2 at uw.edu>
January 28, 2015 at 5:25 PM
Leah:
You might want to focus on either (1) data preparation/munging/management,
or (2) data analysis/presentation. Data exploration probably is in between
the two areas. 
It could be too aggressive trying to cover all aspects of data science in
two days.  
Also you might want to assume the attendees can master at least one
programming language. I believe learning how to program belongs to software
carpentry. Data carpentry is all about data science.
My 2 cents.
Thanks.
-kai
 
_______________________________________________
dc-discuss mailing list
dc-discuss at lists.idyll.org
http://lists.idyll.org/listinfo/dc-discuss

Leah Wasser <lwasser at neoninc.org>
January 28, 2015 at 12:09 AM
Hi Tracy, and fellow DC participants.
I am looking for some advice. There has been ongoing interest in a Data
Carpentry workshop at NEON (where I work, for those of you who don't know
me). :) 

The challenge that I see at this point, is figuring out what content would
be most relevant. I posted a survey today and already have a handful of
responses - all interested in a 2 day workshop. 

However there is a mix of interest in Python vs R. And some mix of
background (mostly intermediate focused however). 

I am giving everyone a week to respond to the survey. Then I need to figure
out an approach. Depending upon the volume of responses, I am even thinking
about something that is split across days (R one day, python another). Git
and shell combined? OR SQL ? Can anyone help guide me through the logistics
of deciding the best approach for this workshop once the survey results are
in?

Thank you in advance!!
leah

Leah A. Wasser, Ph.D.
Remote Sensing Ecologist
Senior Science Educator - Universities
National Ecological Observatory Network (NEON)
Boulder, Colorado

________________________________________
From: dc-discuss-bounces at lists.idyll.org on behalf of
dc-discuss-request at lists.idyll.org
Sent: Tuesday, January 27, 2015 1:00 PM
To: dc-discuss at lists.idyll.org
Subject: dc-discuss Digest, Vol 3, Issue 2


Today's Topics:

1. Data Carpentry Genomics and Assessment hackathon (Tracy Teal)


----------------------------------------------------------------------

Message: 1
Date: Tue, 27 Jan 2015 10:34:30 -0500
From: Tracy Teal <tkteal at datacarpentry.org>
Subject: [data-carpentry-discuss] Data Carpentry Genomics and
Assessment hackathon
To: discuss at datacarpentry.org
Message-ID: <54C7B006.4060605 at datacarpentry.org>
Content-Type: text/plain; charset=windows-1252; format=flowed

If you're working or interested in genomics or assessment, we hope
you'll consider applying for our upcoming Data Carpentry Genomics and
Assessment hackathon. We're very excited about this event and the
opportunity to develop lessons targeting genomics researchers and build
assessment into the curriculum. Travel support is available. Please
apply to participate!

Dates: March 23-25, 2015
Location: Cold Spring Harbor Labs, NY

It's a short application and the deadline is this Friday, January 30th.

Call for Participation:
https://docs.google.com/document/d/1r5Bfc-Igt7Hd8kjXsuPw7SenOHkxIQbEDbtnZfAxXbA/pub

Application:
https://docs.google.com/forms/d/17cSQyPIvTIhCQGrFLRoQ0kSway1ZyxgRm9QL85BW8v8/viewform

If you have any questions about the event, please let me know!

Best,
-Tracy



------------------------------



End of dc-discuss Digest, Vol 3, Issue 2
****************************************
