[data-carpentry-discuss] Organizing Data Carpentry workshop - advice

Leah Wasser lwasser at neoninc.org
Fri Jan 30 09:31:34 PST 2015


Thank you Ted, Christie and Greg et al!
This advice is extremely useful. I just rechecked the spreadsheet. I am up to 27 responses, most of whom are intermediate-level coders, although some might be learning a new language (R to Python, etc.), so I will do a follow-up survey.

I've got 12 for Python and 15 for R. I might get a few more next week. Looks like a custom approach will be warranted. I'll dig into this a bit more.

If you are all open, I'd love to run ideas by the listserv or by this group.
Leah


From: Greg Wilson [mailto:gvwilson at software-carpentry.org]
Sent: Thursday, January 29, 2015 3:51 PM
To: Christie Bahlai; 'Ted Hart'; Leah Wasser; 'Tracy Teal'; 'Hsi-Kai (Kai) Yang'
Cc: dc-discuss at lists.idyll.org
Subject: Re: [data-carpentry-discuss] Organizing Data Carpentry workshop - advice

Hi all,

Hybridization engenders robustness - sure, do what's best for the audience and tell us afterward how it went.

Thanks
Greg
On 2015-01-29 12:15 PM, Christie Bahlai wrote:
Hi Leah, Ted, et al;

If it's heresy, oops, I'm a heretic ;). I recently instructed at a SC bootcamp at U of M, and we had learners divided into two rooms: novice and intermediate. Looking at the pre-workshop surveys, I decided that the novice room needed a bit more of an introductory go at object manipulation than the standard SC materials provided, so I used the DC materials for the first half of the R lesson. My co-instructors also worked some of the DC data management curriculum in with the shell lesson.

Is this slight hybridization kosher? All I know is the students are apparently asking for a DC bootcamp as well!

Cheers,
-Christie

From: dc-discuss-bounces at lists.idyll.org [mailto:dc-discuss-bounces at lists.idyll.org] On Behalf Of Ted Hart
Sent: Thursday, January 29, 2015 11:55 AM
To: Leah Wasser; Tracy Teal; Hsi-Kai (Kai) Yang
Cc: dc-discuss at lists.idyll.org
Subject: Re: [data-carpentry-discuss] Organizing Data Carpentry workshop - advice

Hi Leah,

Glad to see that NEON is supporting you in putting on the workshop. Having read your e-mails, it sounds like what you want is partially in the domain of Software Carpentry. I think SWC would give them many things they might need but not know they need. For instance, you heard that people are interested in loops, automation and optimization. Well, those are all pretty hard to do without a decent understanding of variables and data structures, and even optimization (profiling to start) needs functions.

To expand a bit, how can you do loops and automation in any language without understanding the basic data structures? E.g. looping over a list is different from looping over a vector, which is different from looping over a dataframe in R. So I wonder if you could just sort of slip those concepts in while focusing on the parts people want to learn, and they'll learn more without even knowing it :). Git is also probably essential because I know more and more NEON work is being put on GitHub. In the end the workshop might end up looking like a hybrid of an advanced Data Carpentry (dealing with complex formats, SQL) and some basic Software Carpentry.
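
A minimal sketch of the data-structure point, written in Python rather than R and using only made-up names and values (`site_ids`, `heights` and `table` are assumptions for illustration, not NEON data). The same `for` loop hands back different things depending on the structure it walks:

```python
# Hypothetical data, invented for illustration only.
site_ids = ["A", "B", "C"]                  # a flat list
heights = {"A": 1.5, "B": 2.0, "C": 2.5}    # a mapping from site to value
table = [                                   # a small table: one dict per row
    {"site": "A", "height": 1.5},
    {"site": "B", "height": 2.0},
    {"site": "C", "height": 2.5},
]

# Looping over a list yields each element in turn.
list_items = []
for site in site_ids:
    list_items.append(site)

# Looping over a dict yields only the keys by default.
dict_keys = []
for key in heights:
    dict_keys.append(key)

# Asking for .items() yields (key, value) pairs instead.
dict_pairs = []
for key, value in heights.items():
    dict_pairs.append((key, value))

# Looping over the table yields whole rows, so pulling out
# one column takes an extra lookup inside the loop.
row_sites = []
total_height = 0.0
for row in table:
    row_sites.append(row["site"])
    total_height += row["height"]
mean_height = total_height / len(table)
```

Someone who already writes loops over lists can still trip over the second and third cases, which is roughly why the structures are worth covering alongside automation.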

I don't know if this is heresy, but it might be best to pull from both curriculums to meet the specific needs of NEON.

Ted



On Thu Jan 29 2015 at 8:03:18 AM Leah Wasser <lwasser at neoninc.org> wrote:
Hi Ethan, Tracy and Kai,

Thank you ALL so much for the feedback so far. I truly appreciate it.

I gave folks here until next week to fill out the survey. As of now, I'm at 18 responses, most of whom are open to / interested in a full 2-day workshop: 6 for Python, 12 for R. Most are interested in SQL as well.

Pulling Git makes sense, and most would benefit from a short section on shell. I think most really want to focus on R or Python. I am also open to doing two workshops or splitting materials if that makes sense to cover everyone's interests. For instance, I thought about just doing an afternoon focused on manipulating HDF5 data in R.

In this case, our audience is not novice users. They self-identify as beginner/intermediate, i.e. they're programming now, many regularly. Many don't have formal training and want to hone skills. Does that by default make DC not quite the right fit here? Thoughts?

Topically they are interested in

1. Hierarchical data formats (I've built materials for that)
2. Spatial data (I have a bit on that as well)
3. Automating processes, looping, optimization
4. I'm sure data viz as well (I didn't include that in the list)

Very few were interested in the core programming skills (i.e. creating functions, variables, etc.). I think that is because they are already using those skills.

Thank you again for any feedback / advice, etc.
Leah

From: Tracy Teal [mailto:tkteal at datacarpentry.org]
Sent: Wednesday, January 28, 2015 8:40 PM
To: Hsi-Kai (Kai) Yang
Cc: Leah Wasser; dc-discuss at lists.idyll.org
Subject: Re: [data-carpentry-discuss] Organizing Data Carpentry workshop - advice

Hi Leah,

That's great that there's good interest! I agree with Ethan both that the lessons for Python and R are redundant and that it would be tough to teach both languages in two days. Maybe the question is more, what are people looking to do? Are they doing statistical analysis and data visualization with ecological data? If so, then R might be better. If they want to write scripts for data parsing or work with a lot of colleagues who are working in Python, then Python might be better.

As Kai says, the current Data Carpentry workshop components are focused on data organization/management and the data analysis/presentation/visualization.

The current modules are:

- spreadsheets for data organization
- OpenRefine for data cleaning (30 minute demo)
- SQL for managing and querying data
- introduction to R or Python for data analysis and visualization
- the shell for automation

If you wanted to be able to spend more time on R or Python, which we're finding people are interested in, especially being able to get through more of the data visualization, you could leave out the shell or SQL and use that extra time for R or Python.

The mix of experience is always a challenge. Data Carpentry lessons right now have been developed for people with little to no prior computational experience, so no prior experience is expected or required. This means things can be a little slower for people who do have some experience, but has the advantage that we're clear about the level up front and it doesn't leave as many people behind. People with more experience still learn new tips and tricks for the things they've seen already and can help their neighbors, and often even if someone is experienced with one tool, they might be new to another, so maybe they know SQL well but haven't worked in R before.

This focus on learners newer to computation does mean, as Ethan mentioned, that we're also not currently teaching Git. As a concept, it's more advanced than what most people new to programming are ready for. We have talked about adding to the R lesson a component about working with GitHub from within RStudio, as that takes away some of the complexity, but haven't had a chance to develop that or try it out yet.

Do this approach and these modules seem to match what the people there need?

Best,
-Tracy
Hsi-Kai (Kai) Yang <hky2 at uw.edu>
January 28, 2015 at 5:25 PM
Leah:
You might want to focus on either (1) data preparation/munging/management, or (2) data analysis/presentation. Data exploration probably falls between the two areas.
It could be too aggressive to try to cover all aspects of data science in two days.
Also, you might want to assume the attendees can master at least one programming language. I believe learning how to program belongs to Software Carpentry. Data Carpentry is all about data science.
My 2 cents.
Thanks.
-kai

_______________________________________________
dc-discuss mailing list
dc-discuss at lists.idyll.org
http://lists.idyll.org/listinfo/dc-discuss
Leah Wasser <lwasser at neoninc.org>
January 28, 2015 at 12:09 AM
Hi Tracy and fellow DC participants,
I am looking for some advice. There has been ongoing interest in a Data Carpentry workshop at NEON (where I work, for those of you who don't know me). :)

The challenge that I see at this point is figuring out what content would be most relevant. I posted a survey today and already have a handful of responses, all interested in a 2-day workshop.

However, there is a mix of interest in Python vs. R, and some mix of background (mostly intermediate, however).

I am giving everyone a week to respond to the survey. Then I need to figure out an approach. Depending upon the volume of responses, I am even thinking about something that is split across days (R one day, Python another). Git and shell combined? Or SQL? Can anyone help guide me through the logistics of deciding the best approach for this workshop once the survey results are in?

Thank you in advance!!
leah

Leah A. Wasser, Ph.D.
Remote Sensing Ecologist
Senior Science Educator - Universities
National Ecological Observatory Network (NEON)
Boulder, Colorado

________________________________________
From: dc-discuss-bounces at lists.idyll.org on behalf of dc-discuss-request at lists.idyll.org
Sent: Tuesday, January 27, 2015 1:00 PM
To: dc-discuss at lists.idyll.org
Subject: dc-discuss Digest, Vol 3, Issue 2

Today's Topics:

1. Data Carpentry Genomics and Assessment hackathon (Tracy Teal)


----------------------------------------------------------------------

Message: 1
Date: Tue, 27 Jan 2015 10:34:30 -0500
From: Tracy Teal <tkteal at datacarpentry.org>
Subject: [data-carpentry-discuss] Data Carpentry Genomics and
Assessment hackathon
To: discuss at datacarpentry.org
Message-ID: <54C7B006.4060605 at datacarpentry.org>
Content-Type: text/plain; charset=windows-1252; format=flowed

If you're working or interested in genomics or assessment, we hope
you'll consider applying for our upcoming Data Carpentry Genomics and
Assessment hackathon. We're very excited about this event and the
opportunity to develop lessons targeting genomics researchers and build
assessment into the curriculum. Travel support is available. Please
apply to participate!

Dates: March 23-25, 2015
Location: Cold Spring Harbor Labs, NY

It's a short application and the deadline is this Friday, January 30th.

Call for Participation:
https://docs.google.com/document/d/1r5Bfc-Igt7Hd8kjXsuPw7SenOHkxIQbEDbtnZfAxXbA/pub

Application:
https://docs.google.com/forms/d/17cSQyPIvTIhCQGrFLRoQ0kSway1ZyxgRm9QL85BW8v8/viewform

If you have any questions about the event, please let me know!

Best,
-Tracy



--

Dr. Greg Wilson    | gvwilson at software-carpentry.org

Software Carpentry | http://software-carpentry.org