The Shoah Foundation Institute has collected thousands of hours of testimony from survivors of the Holocaust. This footage has been organized and made accessible to scholars and students. Each testimony is divided into 1-minute segments, which are tagged with appropriate keywords, so users can search the testimonies for segments containing specific keywords. The goal of our project is to enable users to draw on geo-temporal data from the testimonies as they explore the Institute's Visual History Archive.
Related topics are often mentioned in close temporal proximity to each other within a survivor’s testimony. We organized our data as a matrix in which rows correspond to geo-keywords and columns to topical keywords. Each entry in the matrix gives the number of times a geo-keyword appears in the same 1-minute segment as a topical keyword. From this matrix we calculated a degree of similarity between each pair of geo-keywords, and we will use this information to improve the search results. Our accomplishments thus far include organizing the data, gaining an understanding of the database, basic statistical exploration, creation of heat maps, and clustering of geographical keywords. We have also analyzed the correlation between degree of similarity and geographic distance. This correlation is statistically significant but extremely weak. We therefore cannot justify suggesting, in search results, general keywords associated with locations that are geographically close to the original query. Nevertheless, we have created a useful algorithm for ranking the results that takes into account keywords common in the geographic region.
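As an illustration of the analysis described above, the following Python sketch builds a small geo-keyword by topical-keyword co-occurrence matrix, computes a pairwise similarity between geo-keywords, and correlates that similarity with geographic distance. It is a minimal sketch under stated assumptions, not the project's implementation: the toy segments, the approximate coordinates, the helper names, and the use of cosine similarity and Pearson correlation are all hypothetical choices made here, since the text does not specify which similarity measure or correlation statistic was used.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each 1-minute segment is a pair
# (set of geo-keywords, set of topical keywords).
segments = [
    ({"Warsaw"}, {"ghettos", "hunger"}),
    ({"Warsaw", "Lodz"}, {"ghettos"}),
    ({"Lodz"}, {"ghettos", "forced labor"}),
    ({"Paris"}, {"hiding", "forced labor"}),
]

# Approximate (latitude, longitude) in degrees, for illustration only.
coords = {"Warsaw": (52.23, 21.01), "Lodz": (51.76, 19.46), "Paris": (48.86, 2.35)}


def cooccurrence(segs):
    """Count how often each geo-keyword shares a segment with each topical keyword."""
    counts = defaultdict(lambda: defaultdict(int))
    for geos, topics in segs:
        for g in geos:
            for t in topics:
                counts[g][t] += 1
    return counts


def cosine(row_a, row_b):
    """Cosine similarity between two sparse rows (assumed similarity measure)."""
    dot = sum(v * row_b.get(k, 0) for k, v in row_a.items())
    norm_a = math.sqrt(sum(v * v for v in row_a.values()))
    norm_b = math.sqrt(sum(v * v for v in row_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


counts = cooccurrence(segments)
pairs = list(combinations(sorted(counts), 2))
sims = [cosine(counts[a], counts[b]) for a, b in pairs]
dists = [haversine_km(coords[a], coords[b]) for a, b in pairs]
r = pearson(sims, dists)

for (a, b), s, d in zip(pairs, sims, dists):
    print(f"{a}-{b}: similarity={s:.3f}, distance={d:.0f} km")
print(f"Pearson r between similarity and distance: {r:.3f}")
```

With only three locations the coefficient says nothing about the real archive. The sketch shows only the shape of the computation: one similarity value and one distance per pair of geo-keywords, followed by a single correlation over all pairs.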