[cse491] hw11 db name loading differences

Alex Nolley nolleyal at msu.edu
Wed Nov 12 19:25:46 PST 2008


I'm having trouble getting the database to recognize that two entries refer
to the same movie or show. The problem occurs with the database loading
module that Titus gave us. For example, while looking through the database
for a good pair of actors to test, I noticed that Austin, Tony (II) and
Banner, David (I) were both in Def Jam Fight for NY. However, the db loading
script imports the string 'Def Jam Fight for NY (2004) (VG)
(voice)  [Teck]  <73>' for Austin, Tony (II) and the separate string 'Def
Jam Fight for NY (2004) (VG)  (voice)  [Himself]  <23>' for Banner, David
(I). Since the strings aren't identical, the database assigns them different
movie_ids, so my intersection searches turn up nothing.

Should we do some extra processing to strip everything after the title? I
can imagine splitting on '(' and taking the [0] entry, but what if a movie
has '(' in its title?
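
One possible alternative to splitting on '(' is to cut the string right after
the year marker, so that parentheses inside the title survive. This is only a
rough sketch: the function name movie_key and the pattern YEAR_RE are made up
for illustration, and it assumes every entry looks like "<title> (<year>)"
followed by optional extras such as (VG), (voice), [role] and <billing>. The
"????" and "/II" year variants are also an assumption about the data, not
something I have checked against the loader.

```python
import re

# Assumed layout: "<title> (<year>)" followed by optional per-actor extras.
# The year is taken to be four digits or "????", optionally with a
# roman-numeral suffix such as "(2004/II)". These are assumptions.
YEAR_RE = re.compile(r"\((?:\d{4}|\?{4})(?:/[IVXLCDM]+)?\)")


def movie_key(raw):
    """Return the title plus year, dropping everything after the year."""
    match = YEAR_RE.search(raw)
    if match is None:
        # No year marker found: fall back to the whitespace-normalized string.
        return " ".join(raw.split())
    # Keep everything up to and including the first year marker, and
    # collapse runs of whitespace so spacing differences don't matter.
    return " ".join(raw[:match.end()].split())


if __name__ == "__main__":
    a = movie_key("Def Jam Fight for NY (2004) (VG) (voice)  [Teck]  <73>")
    b = movie_key("Def Jam Fight for NY (2004) (VG)  (voice)  [Himself]  <23>")
    print(a == b)
```

Note that this drops the (VG) marker too, so a video game and a film with the
same title and year would end up with the same key. It also cuts at the first
year-like marker, so a title that itself contains something like "(1984)"
would be truncated too early.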

 

Thanks,

Alex
