[socal-piggies] Seeks guidance: Python 2.7, XML, SQL, ...on GAE

Daniel Stewart danieljstwrt at gmail.com
Mon Jan 9 13:47:27 PST 2012


What I'm hearing is that I need to invest more time in how I plan to
organize this data, and where I plan to store it. It seems the thrill
of getting to work on making sense of a few hundred unique strings got
the better of me. I'm so glad I asked, because I left out a few things:

This project, at its most ambitious, will parse the data from not 1
but 712 different feeds, all structurally consistent and at known
URLs. The feeds are divided among 712 geographically unique
sub-domains, all from a single host, spanning 27 different ccTLDs, at
an initial rate of 1x daily.
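
Since the feeds are structurally consistent, one parsing function
should in principle serve all of them. Below is a minimal sketch of
that step, not a finished implementation. It assumes the feeds are
RSS 1.0 (RDF) documents whose items carry a title and a link; the
sample document, the example.com URLs, and the parse_items name are
all made up for illustration. It uses xml.etree.ElementTree rather
than cElementTree, which offers the same interface.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RSS 1.0 (RDF) feeds.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"

# Hypothetical sample feed, encoded as iso-8859-1 bytes.
# The real feeds may carry more (or different) elements per item.
SAMPLE = u"""<?xml version="1.0" encoding="iso-8859-1"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/feed">
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Sample</description>
  </channel>
  <item rdf:about="http://example.com/1">
    <title>Caf\xe9 opening</title>
    <link>http://example.com/1</link>
  </item>
  <item rdf:about="http://example.com/2">
    <title>Second item</title>
    <link>http://example.com/2</link>
  </item>
</rdf:RDF>
""".encode("iso-8859-1")


def parse_items(xml_bytes):
    """Return a list of dicts, one per <item> in an RSS 1.0 document."""
    # Passing raw bytes lets the parser honour the encoding declared
    # in the XML declaration (iso-8859-1 here).
    root = ET.fromstring(xml_bytes)
    items = []
    # In RSS 1.0 the <item> elements are direct children of <rdf:RDF>.
    for item in root.findall("{%s}item" % RSS):
        items.append({
            "about": item.get("{%s}about" % RDF),
            "title": item.findtext("{%s}title" % RSS),
            "link": item.findtext("{%s}link" % RSS),
        })
    return items


items = parse_items(SAMPLE)
for entry in items:
    print(entry["link"])
```

Handing the parser bytes instead of an already-decoded string is
deliberate: it avoids guessing the encoding by hand.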

Archiving this data, while good for the project itself, is not vital
to how it will be used, day in, day out. Once gathered, scrubbed, and
organized, I estimate that this data will need to be stored for a
period of time ranging from about 1 week to 1 month. Of course, I'll
be able to narrow down that estimate once some data has been
collected.
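
The retention window could be sketched as a simple cutoff filter. This
is only an illustration: the 30-day default is the upper end of the
estimate above, and the record layout (a dict with a fetched_at
timestamp) and the prune name are assumptions, not a decided design.

```python
from datetime import datetime, timedelta

# Assumed upper bound of the retention estimate ("about 1 month").
RETENTION = timedelta(days=30)


def prune(records, now, retention=RETENTION):
    """Keep only the records fetched within the retention window."""
    cutoff = now - retention
    return [r for r in records if r["fetched_at"] >= cutoff]


# Hypothetical records; real ones would come from the parsed feeds.
now = datetime(2012, 1, 9)
records = [
    {"title": "old", "fetched_at": datetime(2011, 12, 1)},
    {"title": "recent", "fetched_at": datetime(2012, 1, 5)},
]
kept = prune(records, now)
```

Keeping the window as a parameter makes it easy to tighten once real
data shows how long it stays useful.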

> Is SQL still even an option for me at this point?

I signed up for the Limited Preview of Google Cloud SQL sometime last
week: http://code.google.com/apis/sql/

I doubt I'll meet their criteria...
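
Whichever relational backend turns out to be available, the storage
side might look roughly like the sketch below. It uses the standard
library's sqlite3 module purely as a stand-in so it can run anywhere;
it is not Cloud SQL or MySQLdb. The table layout, the feed label, and
the open_store and save_items names are assumptions for illustration.
Note that "INSERT OR IGNORE" is SQLite syntax, and sqlite3 uses "?"
placeholders where MySQLdb uses "%s".

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the items table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        " feed TEXT NOT NULL,"
        " link TEXT NOT NULL,"
        " title TEXT,"
        " fetched_on TEXT NOT NULL,"
        " PRIMARY KEY (feed, link))"
    )
    return conn


def save_items(conn, feed, fetched_on, items):
    """Store parsed items, skipping any (feed, link) pair already seen."""
    conn.executemany(
        "INSERT OR IGNORE INTO items (feed, link, title, fetched_on)"
        " VALUES (?, ?, ?, ?)",
        [(feed, it["link"], it["title"], fetched_on) for it in items],
    )
    conn.commit()


# Hypothetical data: the same two items seen on two consecutive days.
conn = open_store()
sample = [
    {"link": "http://example.com/1", "title": "a"},
    {"link": "http://example.com/2", "title": "b"},
]
save_items(conn, "feed-001", "2012-01-09", sample)
save_items(conn, "feed-001", "2012-01-10", sample)
count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

The composite primary key is one way to keep a daily fetch from
storing the same item twice.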


Thanks again,

Daniel

On Fri, Jan 6, 2012 at 12:20 PM, <socal-piggies-request at lists.idyll.org> wrote:
>
> ---------- Forwarded message ----------
> From: Daniel Stewart <danieljstwrt at gmail.com>
> To: socal-piggies at lists.idyll.org
> Cc:
> Date: Thu, 5 Jan 2012 16:45:37 -0800
> Subject: [socal-piggies] Seeks guidance: Python 2.7, XML, SQL, ...on GAE
> Dear PiGGies,
>
>
> January marks the first anniversary of my membership in our SoCal
> group. During this time, I've only begun to scratch the surface of
> what Python is actually capable of (it would appear anything and
> everything). My imagination has already far outgrown my ability to
> actualize it, and for maybe the first time in my life, I've
> encountered an idea worth actualizing.
>
> The countless resources available to beginners, while helpful, have
> left some gaps in my comprehension. This cannot wait for me to catch
> up. I seek your guidance.
>
> I'm currently stuck trying to parse the contents of a single RSS feed,
> from an RDF Schema, iso-8859-1 encoded XML document at a single known
> URL, using the cElementTree module, and then to store this parsed data
> in a single SQL database, via the MySQLdb library, at an initial rate
> of 1x daily on, wait for it...
>
> Google App Engine (GAE).
>
> ...running Python 2.5, and 2.7 (experimentally, apparently).
>
> > Why GAE?
>
> Because some of the data we parse will be used to query the Google
> Places API. In some cases, at its most reckless, each string will be
> iterated (itertools?) and each iteration queried. And because the
> thought of having everything housed under one roof gives me a false
> sense of comfort.
>
> In my early attempts, I've already encountered bugs that are
> previously known, well documented, and unique to GAE. This does not
> bode well for the confidence of a new programmer. Perhaps my first
> question should instead be:
>
> > Is GAE the right tool for what I'm trying to do?
>
>
> Thank you,
>
> Daniel Stewart
>
>
>
>
> ---------- Forwarded message ----------
> From: Michael Elkins <me at sigpipe.org>
> To: SoCal Python Interest Group <socal-piggies at lists.idyll.org>
> Cc:
> Date: Thu, 5 Jan 2012 17:38:28 -0800
> Subject: Re: [socal-piggies] Seeks guidance: Python 2.7, XML, SQL, ...on GAE
> On Thu, Jan 05, 2012 at 04:45:37PM -0800, Daniel Stewart wrote:
>>
>> I'm currently stuck trying to parse the contents of a single RSS feed,
>> from an RDF Schema, iso-8859-1 encoded XML document at a single known
>> URL, using the cElementTree module, and then to store this parsed data
>> in a single SQL database, via the MySQLdb library, at an initial rate
>> of 1x daily on, wait for it...
>>
>> Google App Engine (GAE).
>
>
> You may want to investigate GAE's datastore before you proceed. You
> will not be able to use MySQLdb, because GAE has its own datastore
> with a different API. I haven't looked at the state of open-source
> reimplementations (see the Wikipedia link below), but one thing to
> consider is what you will do if you need to migrate off GAE in the
> future, so that you are not locked in.
>
> http://code.google.com/appengine/docs/python/gettingstarted/usingdatastore.html
>
> http://en.wikipedia.org/wiki/Google_App_Engine#Portability_concerns
>
>
>
>
> ---------- Forwarded message ----------
> From: John Matthew <john at compunique.com>
> To: SoCal Python Interest Group <socal-piggies at lists.idyll.org>
> Cc:
> Date: Thu, 5 Jan 2012 17:47:15 -0800
> Subject: Re: [socal-piggies] Seeks guidance: Python 2.7, XML, SQL, ...on GAE
> Didn't Google just add beta MySQL?
>
>
>
>
>
> ---------- Forwarded message ----------
> From: Michael Elkins <me at sigpipe.org>
> To: SoCal Python Interest Group <socal-piggies at lists.idyll.org>
> Cc:
> Date: Thu, 5 Jan 2012 17:51:47 -0800
> Subject: Re: [socal-piggies] Seeks guidance: Python 2.7, XML, SQL, ...on GAE
> On Thu, Jan 05, 2012 at 05:47:15PM -0800, John Matthew wrote:
>>
>> Didn't Google just add beta MySQL?
>
>
> I missed that announcement: http://googleappengine.blogspot.com/2011/10/google-cloud-sql-your-database-in-cloud.html
>
> It appears to still be in limited preview, though:
> http://code.google.com/apis/sql/
>
>
>
>
> ---------- Forwarded message ----------
> From: Grig Gheorghiu <grig.gheorghiu at gmail.com>
> To: SoCal Python Interest Group <socal-piggies at lists.idyll.org>
> Cc:
> Date: Fri, 6 Jan 2012 08:40:31 -0800
> Subject: Re: [socal-piggies] Seeks guidance: Python 2.7, XML, SQL, ...on GAE
> I concur with Michael. If you want to use GAE, you might as well use
> their 'infinitely scalable' datastore, which is basically Google's
> BigTable. Why restrict yourself to MySQL? The learning curve may be a
> bit steep but it's worth it.
>
> However, as Michael says, make sure you have a strategy for exporting
> that data somehow. It won't be easily importable into a relational DB,
> but at least you should have it around for disaster recovery purposes.
>
> Grig
>
>
>
>
> _______________________________________________
> socal-piggies mailing list
> socal-piggies at lists.idyll.org
> http://lists.idyll.org/listinfo/socal-piggies
>


