This is the second visualization from the project, showing the results of several natural language processing analyses of the original texts. It plots the language patterns embedded in 232,567 pages of historical Texas newspapers, as they evolved over time and space. For any date range and location, you can browse the most common words (word counts), named entities (people, places, etc.), and highly correlated words (topic models).
See the visualization at language.mappingtexts.org »
And so we have been experimenting with topic modeling for this project, concentrating on the popular MALLET software package. We recently presented a paper based on this work at the meeting of the Association for Computational Linguistics in June 2011, “Topic Modeling on Historical Newspapers.”
From Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (2011), pp. 96-104.
This visualization plots the quantity and quality of 232,567 pages of historical Texas newspapers, as they spread out over time and space. The graphs plot the overall quantity of information available by year and the quality of the corpus (by comparing the number of words we can recognize to the total number scanned). The map shows the geography of the collection, grouping all newspapers by their publication city, and can show both the quantity and quality of the newspapers from various locations. Clicking on a particular city will provide a detailed view of the individual newspapers, where you can examine both the quantity and quality of information. A timeline of historical events related to Texas is also available for context.
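The quality measure described above, the share of scanned words that can be recognized, can be illustrated with a minimal sketch. This is not the project's actual code: the function name `recognition_rate`, the tiny word list, and the sample page are all hypothetical, and a real analysis would presumably use a full dictionary and the OCR output for each newspaper page.

```python
import re


def recognition_rate(ocr_text, vocabulary):
    """Return the fraction of scanned tokens found in a known-word list.

    Hypothetical helper for illustration only; `vocabulary` stands in for
    whatever dictionary is used to decide that a word is "recognized".
    """
    # Split the OCR output into lowercase alphabetic tokens.
    tokens = re.findall(r"[A-Za-z]+", ocr_text.lower())
    if not tokens:
        return 0.0
    recognized = sum(1 for token in tokens if token in vocabulary)
    return recognized / len(tokens)


# Assumed toy data: two of the six words were garbled during scanning.
vocabulary = {"the", "texas", "railroad", "arrived", "in", "houston"}
page = "The Texas railroad arrivcd in Houfton"
print(recognition_rate(page, vocabulary))
```

Computing this ratio per page and then averaging by year or by publication city would give the kind of figures that the graphs and map display.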