Digital History

By Mark Ciotola

First published on August 8, 2019. Last updated on February 15, 2020.

An introduction to digital history technologies, from the well-established such as document preservation and geographic information systems, to the exotic and future-oriented such as artificial intelligence and virtual reality.

Preface

First published on February 15, 2020. Last updated on June 10, 2024.

This course introduces digital history technologies, from the well-established such as document preservation and geographic information systems, to the exotic and future-oriented such as artificial intelligence and virtual reality. This course is broken up into a series of lessons, with material presented to introduce each topic, discuss how it is useful to historians, and demonstrate how to begin using it. There may be activities to learn and gain practice with the technology.

Each person has a different level of experience and comfort with technologies. Further, each person has different areas of familiarity with particular technologies, regardless of their general level of technical know-how. So this course covers technology basics, but there are also some more challenging Leveling Up activities that serve as an entrance to delving deeper into these areas.

There will be Recommended Reading to help you learn about particular support technologies. If you already know those areas, feel free to skip those readings. Finally, there may be Resources and Further Reading, to help you learn more about the topic or complete the activities.

Working with technology is not completely easy for anyone. Even the most savvy programmers spend considerable time web searching for more information or solutions when things don’t work. While it is best to start learning from curated materials such as those on this site, web searches can help fill the gaps.

Finally, remember that you don’t have to be able to figure out everything in this course to benefit from it. Mere familiarity with what these technologies are and why they are important can start bringing you benefits from them and help you shape your strategy for using digital technologies. You can then later focus on what you really need to learn to accomplish your objectives.

1 What Is Digital Technology and How Can Historians Use It?

First published on August 24, 2019. Last updated on June 10, 2024.

Learning Objectives

Introduction to digital history and its context within the discipline of history.
An introduction to the topics the course will cover.
How to prepare for the remainder of the course.

What Is Digital History?

What is digital technology and how can historians use it? The term digital is derived from the Latin digitus, meaning finger or toe. Most people have ten fingers available for counting up to ten. Most computer hardware, figuratively speaking, has a hand with only one finger, so it can only count from zero to one. Nevertheless, computers can process lots of zeros and ones very quickly. Larger numbers, letters, images, videos and much more can be represented and processed as groups of ones and zeros. Digital history can be said to encompass all historical endeavors and works that involve such technology. That is a really broad definition. In practice, historians typically focus on a particular set of tools and technologies.

Examples of Digital Tools

With the advent of powerful computers and other technologies, there are many more tools for accessing existing historical sources, further evidence-gathering and the analysis of history. There are numerous tools used for digital history.

Such tools include those long used for the recording and manipulation of digital information:

pencil on paper
abacuses
clay tablets (which can sometimes be changed with a little bit of water)

Common desktop software can be used for digital history, either to record or examine documents, act as a simple database or for communication and visualization:

Word processors such as MS Word and Apple Pages
Spreadsheets such as MS Excel and Apple Numbers
PDF tools such as Adobe Acrobat Pro or PDF Expert

There are reference tools such as:

Clio (US historical reference site)
Mathematical and quantitative fact-finding tools such as Wolfram Alpha

Programming allows you to write your own tools. Examples of programming languages include:

Perl
Ruby
Python
Statistical languages and plotting software such as R
Graphical languages such as Processing and SVG

There are also specialized application platforms:

Geographical Information Systems (GIS) such as QGIS and ArcGIS.
Omeka for sharing digital collections and creating media-rich online exhibits.

Resources

Alliance of Digital Humanities Organizations
Association for Computers and the Humanities
Digital Humanities Course Register
Corporation for Digital Scholarship (nonprofit) offers tools including Omeka, Zotero and Tropy.
Quinn Dombrowski (UC Berkeley) and Jody Perkins (Miami University in Ohio), Article about TaDiRAH – Taxonomy of Digital Research Activities in the Humanities.
Digital Research Tools (DiRT) Directory (updated to 2012; later versions were not working when checked)
NEH Office of Digital Humanities
Programming Historian

1.1 Introduction To Digital History Activities

First published on February 15, 2020. Last updated on February 15, 2020.

Activities

Refresh your basic computer skills, such as accessing your university accounts and using web applications such as Google Docs, text editors, etc. (The instructor will assess the “digital readiness” of the students.)
Discuss what you envision that digital history is and is not.

Leveling Up

Try out the Ruby programming language, which is an easy, forgiving language for learning the high-level concepts of programming.

2 Becoming A Digital Explorer of the Online World of History

First published on August 24, 2019. Last updated on June 10, 2024.

Objectives

Existing online historical sites and tools will be explored.
Students will apply what they have learned in other history courses to critique the validity and relevance of online historical materials.

Searching For Online Information

Digital history allows you to become a digital explorer of the online world of history. Performing a web search on most historical terms will provide you with plenty of information. Some online sources are better than others. As a history student, you should know how to critically evaluate sources. Who provided the information? Is it claimed to be fact, opinion or speculation? What is the authority of the source? How would they know? Are they biased? Is the document by whom it purports to be? (Forged documents and fake sources still exist, as they always have.)

Wikipedia requires special mention. It is a great way to become introduced to historical topics and see related topics. However, you don’t know who wrote an article or whether it is true. Also, key information might be omitted. So while Wikipedia might be your first stop to find out about a historical subject, it certainly should not be your last. It is better to have a source whose author is identified and who takes responsibility for the information proffered.

Google Maps is sometimes a good way to see what historical sites look like today. There were no satellites before 1957, so there cannot be any satellite imagery from before then. Many historical sites do not have Google Street View coverage. However, aerial views and street-level photos have been taken since the 1800s, and some of those may be found online. A Google image search can sometimes be more efficient than a text search.

Types of Online Resources

There are several types of digital resources for serious scholars. Some were originally in hard copy form and later digitized, while others have always “lived” online.

Peer-Reviewed Articles

Articles in peer-reviewed journals are an important type of resource for scholars. Scholars will read through many articles. Most of those articles cost money to obtain, unless you access them through a school or university. The breadth of access to materials varies tremendously among different schools and universities, but this should still be your first stop if available.

If you know the journal and issue number, you can go directly to the journal listing. Otherwise, you will need to search for the article via the university library’s search services. Many universities offer a simple one-field search interface which will often work by entering in an author and year or title. However, if your author has a common name, you may need to add more specific identifiers, such as an exact title, co-author names, publisher, etc.

If your library’s search service locates the article, then you will need to get the article. Your library search results will (hopefully) offer suggestions (links) for where to obtain the article. Be aware that different subscriptions and services may contain different date ranges for particular journals, so you may have to check several.

Being a subscriber to a journal or a member of the journal’s host society can be a good way to access an article without paying extra for it, but sometimes university search services provide faster access.

Books

Many books are online. Searching for books generally works the same way, unless your institution does not offer a simple search interface. In that case, search through the university’s online book catalog. Here, the results will most often contain digital downloads or digital access to the books via your university. Your library can sometimes get hard copies of books that it does not itself possess.

Dissertations and Master’s Theses

Dissertations and master’s theses can often be valuable sources. Access to them is much less consistent than access to major journals. Some libraries provide digital access while others do not.

Hard-To-Find Materials

If your library does not have direct access to a hard or digitized copy of materials, it can sometimes request a hard or digitized copy from other libraries.

Primary Sources

Libraries often contain special collections of unique, original materials, such as letters, records and maps. Many libraries have digitized some or all of those collections and they may be available online or by request. How do you find such materials, which are scattered across the globe? First try your library’s simple search (e.g. OneSearch) and second, try a general web search.

There are two types of information that may show up on searches: contents of the original source and metadata. Items such as titles and authors are examples of metadata. Since these may be such unique and scarce items, make sure to figure out which terms to search by, and be prepared to make many searches on specific metadata terms or fragments of original content.

More recent primary sources, especially for data, have been digitized and can be found online in government and industry sites.

Computer Security and Data Protection

Before going further, now is a good time to mention that it is a dangerous world out there! (It’s sometimes inside as well.)

You should ensure that you are maintaining good practices when it comes to your computer and its data.

You should regularly back up your entire computer. This means copying your hard drive onto an external medium such as a back-up drive. There are specific utilities that make this easier. Ideally, you will periodically back up the back-up drive and place it in a separate safe location. (Or at least make copies of especially valuable files and keep them in a separate safe place.) Don’t forget about files on your phone or pad. If you back up your computer to network locations such as “the cloud”, be aware that your data may be accessible to other parties even if it is “protected” by a password.
You should use difficult-to-guess passwords. You should keep those passwords recorded in a safe place. There are password utilities that can help. Do not use the same password for everything.
Hackers even go after routers. If you use a router, make certain that the password has been changed from a standard password such as “password”.
If creating a website or web application, follow safety protocols for your language or framework.
Don’t collect personal data from your users unless you absolutely need it, and be extra careful with all of it. Be aware of regulations concerning your collection and retention of data. Read about the European General Data Protection Regulation (GDPR).

Resources

Ancient History Encyclopedia (a general history site with some emphasis on the ancient)
Omeka showcase of digital exhibits
WorldCat (combines metadata from many libraries to make it easier to find sources)
Zotero is a platform for collecting and organizing sources

2.1 Becoming A Digital Explorer Activities

First published on February 15, 2020. Last updated on February 20, 2021.

Activities

Students will try out online tools such as Clio to retrieve information of historical interest.
Students will identify and explore online sources and databases to which they have special access or interest.

Clio Online Web Tool

Choose a place of historical interest in the United States (the tool only has USA locations; you will have the opportunity to examine other countries in future activities).
Open up the Clio website in a browser window.
Enter a location. You will (hopefully) be presented with items of historical interest, or resources for historical investigations.
Write a paragraph on three locations, describing what they are, why they are of particular interest, and the type of information you might expect to gain by visiting them.

Restricted Online Resources Available to You

You are often a member of a community that may have special access to online resources. If you are a student or instructor, your school may have special, free access to online databases and materials that are not available to the public or are behind paywalls. Likewise, if you are a resident in a local community, your local library or museum might provide free access to digital materials for you.

Some employers may provide special access as well. Also, some investment platforms for stocks or retirement funds provide special access to proprietary research materials and other informative media. Further, if you are a student, sometimes you can get free or inexpensive access and subscriptions to materials.

Identify your roles and memberships. Examine websites related to those roles for special access. If you want more than the website mentions, contact persons at relevant organizations via email or telephone to inquire if they offer additional resources.
Try to access one or more of those resources.

2.2 Level Up: Creating Web pages

First published on . Last updated on February 18, 2020.

Activity Objectives

You are viewing this course via a web browser. Each lesson is a web page you can view in a browser. The best way to learn how web pages work and how they structure information is to begin creating your own webpages from scratch (rather than using an authoring platform).

Students will create their own web page using simple HTML and CSS.
Students will set up their own online web site using either a university account, WordPress, Wix or another tool.
Students will try out more advanced text editors.

Setting Up A Website

A website is made up of web pages. The most common type of web page is a HyperText Markup Language (“html”) file. Such a file contains a head section with information about the page, such as its title, and a body section with the contents of the page. Information is contained between an opening tag such as <title> and a closing tag with a forward slash, such as </title>. Some pairs of tags can be nested within other pairs of tags.

Create your own web page by following the below steps.

1. Open up a text editor. Notepad or TextEdit can work, but make certain that you use plain text mode.

2. Type the following into your editor:

<html>

  <head>
    <title>My Clio Results</title>
  </head>

  <body>
    <h1>My Clio Result</h1>
    <p>I found several fascinating items in Clio.</p>
  </body>

</html>

3. Save the file and name it “your_surname_firstname_clio_results.html”.

4. Open the file in a web browser. You should see a page with “My Clio Result” as a large headline and then your sentence beneath.

5. After reading the information below, enter your write-up of your Clio results into the body section of your html page then resave and view it (and turn it in to your instructor if requested).

What is this all about? HTML stands for HyperText Markup Language, which is the standard way to express web pages. The “.html” file extension tells the browser that the file should be interpreted as HTML.

The page has various layers inside of other layers. The highest layer is the html layer, hence the html tags at the beginning and end of the document. They enclose and apply to everything in this document.

Then there is the head part. We use it to enclose the page title, which may show up in browser tabs. The head can also be used to contain information which is useful for the rest of the document, such as scripts and styles. Note: the contents of the head section typically do not appear in the contents of a page.

Next is the body part. This contains the content that you wish people to view in the web page. h1 tells the browser to format this text as the largest standard heading size. There are also h2, h3, h4, h5 and h6 sizes. The p tells the browser to format the enclosed text as a separate paragraph.

Remember that most tags require the content to be terminated with a closing tag, which is typically the opening tag with a “/” placed before its name, such as:

<p>I found several fascinating items in Clio.</p>
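As a minimal sketch of how these pieces combine, the page below uses two heading levels and nests one pair of tags inside another. The em (emphasis) tag, the subsection heading and the sample text are illustrative assumptions and are not part of the activity above.

```html
<html>

  <head>
    <title>My Clio Results</title>
  </head>

  <body>
    <!-- h1 is the largest standard heading -->
    <h1>My Clio Results</h1>

    <!-- h2 is one level smaller, useful for subsections -->
    <h2>First Location</h2>

    <!-- the em pair is nested inside the p pair; the inner tag is closed first -->
    <p>I found <em>several</em> fascinating items in Clio.</p>
  </body>

</html>
```

Saving this as a separate html file and opening it in a browser lets you compare the two heading sizes and see how the nested tag changes only part of the paragraph.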

Styling Content With CSS

CSS stands for Cascading Style Sheets. You can use snippets of CSS code to add styling to your content (or use entire style sheets).

The easiest way to style with CSS is to add CSS code to an html element. This is called inline styling. For example, you can add a background color or gray shading to a paragraph to set it apart. Each piece of CSS styling consists of the aspect to be styled, such as background color, and the style itself, such as a particular color. In CSS, colors can be expressed in several different ways, such as via common names such as red, blue and yellow or via codes for more precise shading, such as #ffcc99.

<p style="background-color: yellow;"> Rome was not built in a day, and neither was Syracuse.</p>

You can do much with styling, and style sheets can be used to easily style whole groups of elements, such as all paragraphs or headings. To improve accessibility and responsive design, it is best to separate content from styling.
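One possible way to style groups of elements is sketched below, assuming the rules are placed in a style element inside the head section. The particular colors, the page title and the sample text are illustrative only.

```html
<html>

  <head>
    <title>Styled Page</title>
    <style>
      /* applies to every h1 heading on the page */
      h1 {
        color: #663300;
      }

      /* applies to every paragraph on the page */
      p {
        background-color: #ffcc99;
      }
    </style>
  </head>

  <body>
    <h1>Ancient Cities</h1>
    <p>Rome was not built in a day, and neither was Syracuse.</p>
    <p>Both paragraphs receive the same background color.</p>
  </body>

</html>
```

Changing a rule inside the style element restyles every matching element at once, while the body holds only content. That is one simple form of separating content from styling.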

Hosting Your Page for Public Access

Your school might provide an online location for you to host web pages for public access. If so, your school might also provide instructions and recommend software to transfer your files online. Otherwise, you can pay a small monthly amount for a hosting service. There are also hosts, such as Wix and WordPress.com, that provide basic authoring environments for free (warning: their offerings are not necessarily endorsed and their terms may change). These environments are relatively easy to use and maintain, and often offer extra features, but their capabilities may nevertheless be limited.

More Advanced Text Editors

Leveling up activities are optional for their section, but they can involve skills and tools that can help you get ahead of the game, or do things (once learned) better or faster.

There are text editors created specifically for editing computer programs and code. They offer a lot of helpful features, but each one has different capabilities and look and feel. You should try out a few and pick the one you like best. Some of these are free, while others cost money.

Some coders prefer command line utilities, but only try these if you feel comfortable with the Unix/Linux command line. Examples include Vi, Vim, Pico, Emacs. (This author sometimes still uses Pico. It can be useful in a pinch.)

3 Document Preservation and Retrieval—Saving Old Information With New Technology

First published on August 24, 2019. Last updated on June 6, 2024.

Objectives

Learn about using digital technologies to preserve documents, photos and other 2D materials.
Learn about the various file types, and how to store, organize and protect digital archives.

Traditional Means of Document Preservation

In the old days, the work of historians was much like that of the Indiana Jones character: seeking ancient manuscripts and other works, sometimes traveling across the globe. Some historians still must do so, and others prefer to do so, but now there is a tremendous amount of historical information available through the internet, if you know how to locate it. However, documents don’t put themselves on the internet. Saving old physical works is another major component of digital history. This lesson concerns document preservation and retrieval, literally saving old information with new technology.

There were several ancient means of recording historical events. The earliest may have been storytelling and songs to help people learn and remember past experiences and events in their society. Cave paintings may have been another early means. When language began to be written (recorded in an external physical form), clay tablets, stone engraving and scrolls of papyrus paper were used. Eventually other forms of paper were used as well, and sheets were combined into books. While inventions such as the printing press resulted in the proliferation of written works, such technologies were mere improvements of the same approach.

Phonographs, film and magnetic materials in the nineteenth and twentieth centuries finally made breakthroughs in recording events and other information. Recordings of audio and visual events could be made so that future persons could experience direct sensory perceptions of those events rather than reading about them. Further, historical documents could be stored in film version and retrieved and photocopied at will. Yet these technologies were not digital.

Saving Old Information With New Technology

Digital technology allows for the preservation, storage and retrieval of historical documents. Technology has allowed such for thousands of years, so what is special about digital technology?

The term digital refers to recording and processing information as strings of numbers, ultimately as binary digits, each being either 0 or 1. This allows that information to be processed by computers. Computers are fast, and the information within them can be transmitted and transformed with relative ease. That means that historical recordings can be reproduced instantly. Documents can be retrieved quickly and searches can be performed easily across millions of documents to search for names, places and terms. Of course the term easy is a relative one, as compared to such searches without digital technology. Searching can still require thinking and skill, but the ratio of brain work to mere mechanical, manual activities has increased significantly.

Digital records often begin life in image form. Often they are then processed using optical character recognition (OCR) and either include a text “layer” or are converted into a text document.

Digital Records and Archives

Let’s first discuss information that is already present on the internet or in other digital forms. There are several important aspects of digital information:

What is it?
In what form is it?
Where is it located?
How can it be accessed?

Is there one answer to rule them all? No! As a historian, you may be confronted with a tremendous variety of answers to these questions. Some sources will be on floppy disks, CDs or even magnetic tapes. Some will be behind paywalls. Some will be in file formats which modern computers cannot read. You can savor the exotic possibilities later. For now, the most common cases will be covered.

Images of Primary Source Documents

A three-thousand-year-old clay tablet can be converted into digital form by simply photographing it. The image will be in a file. If you have technology that can access, read and display the image, then you can see much of the information contained in that ancient tablet. What might be even better is if the contents of the tablet are searchable, in the form of a text file. (Sometimes they will be, sometimes they won’t.)

Examples of images include scans of documents and photographs. Sometimes there might be other representations, such as vector graphics.

Text-Based Primary Source Documents

Primary sources in digital form can be original digital documents (such as notes from a meeting typed on a word processor) or in indirect digital form such as a typed up copy of a newspaper article. Text-based documents are ultimately in files.

Secondary Sources

Older secondary sources may be processed in the same manner as old primary documents. Newer secondary sources are probably already natively in a digital form and can be found by an ordinary web or database search. Unfortunately, many of these sources are behind a paywall. If you don’t want to pay, go through a library. If your university supports OneSearch, this is the easiest way to start looking. Otherwise, your library may have research guides concerning which materials it has available and how to access them. (Each library is somewhat different.)

File Types

Text Files

There are several common forms of text files.

Pure text files end with “.txt”. They only contain text and neither other types of content nor formatting information. This form is generally easy to read by humans, and suitable for computer program code. Sometimes these are called plain text files.
Rich Text Format files contain some formatting information and end with “.rtf”. These are generally not suitable for computer programs.
Some older word processors read and generate files that end with “.doc”. These can contain formatting information, images and other types of content.
Some newer word processors generate files that end with “.docx”. They are similar to .doc files in terms of content and information, but have a significantly different file format.
Comma separated value files have values separated by commas. Strictly speaking, these are text files, but in a form that is readable by databases, spreadsheets and other specialized software. Collections of public and historical records are often exported in this format. They often end with “.csv”.
Structured Information files are similar to database records, and may be considerably more complex than a simple .csv file. They may end with “.xml”.

Archival Files

Archival files may contain primarily text, but they often contain additional elements such as formatting. The Portable Document Format (PDF) can preserve formatting, but it has several possible deficiencies that make retrieving the document in its original form problematic. PDF/A, however, is a preferred archival file format. It attempts to maintain device independence, self-containment and self-documentation. For example, it requires embedded fonts rather than linked fonts, in case the linked source is no longer available. The PDF/A format prohibits the inclusion of audio, video, JavaScript and executable content. So this format may be suitable for traditional print media but not for your favorite video game.

There are several versions of PDF/A files, such as PDF/A-1, PDF/A-2, and PDF/A-3, with the higher numbers allowing for embedding more advanced content such as richer graphics. (If interactivity is required, the PDF/E format might be considered, albeit at the loss of some portability.)

Image Files

There are several common types of image files:

.gif—good for illustrations
.jpg—good for photographs, compressed format to save disk space and load faster
.png—good for illustrations and photographs but may not be supported by all platforms.

Audio-Video Files

There are several common types of audio and video files:

.mp3—a common audio file. Compressed.
.mp4—most common video file. Compressed.
.webm—a popular alternative video format
.mov—a video format used by some Apple applications
.flv—a Flash video format, but Flash may have security issues

Digital Preservation Technology

Although there are many physical ways to help preserve physical documents and artifacts, this discussion will focus on digital technologies. It is possible to merely collect information about an object, such as radar scanning of a large archeological site, or to actually reproduce essential qualities of an object. The most common technologies involve some form of scanning, which is generally non-invasive, minimizing the possibility of damaging the object.

An early form of scanning was a variation of the work of scribes who visually copied documents, except that a human would enter the text contents of a document into a word processor document. Yes, it was slow, but it provided steady paid work for some graduate students. The more modern method is to simply scan documents using a photocopier, fax machine or dedicated scanning machine. Advanced scanning machines can automatically page through an entire book, although this is not recommended for rare or fragile works. Some advanced scanning stations feature very high quality cameras.

Desk with glass and metal scanner set up in V shape.

Book scanning work station (credit: Jason “Textfiles” Scott. CC BY 2.0)

Then the digital image file resulting from the scan is often run through software that recognizes characters (optical character recognition or OCR) and can separate the images into their own files. OCR works pretty well for clean text of common fonts of modern English, but extra steps will probably be required for nearly anything else.

3D scanning and printing can be used for 3D objects, with the caveat that most 3D scanners are not very large.

Resources, Platforms & Services

Archivematica. A web- and standards-based, open-source application which strives to allow institutions to preserve long-term access to trustworthy, authentic and reliable digital content.
Axaem. A records life-cycle management system for records managers and archivists
CONTENTdm. A tool to build and showcase digital collections on personalized websites.
JHOVE. A file format identification, validation and characterization tool, useful for files such as PDFs.
LOCKSS. Services and open-source technologies for high-confidence, resilient, secure digital preservation. Strives to provide a reliable mechanism for long-term digital integrity assurance and access.
Omeka software for sharing digital collections and creating media-rich online exhibits
Portico. Strives to provide libraries and publishers with reliable preservation of electronic resources, and expertise and technical assistance to national libraries, so that their content will be accessible to researchers, scholars, and students in the future.
Rosetta. Digital asset management and preservation solution for libraries, archives, museums and other institutions.

3.1 Document Preservation and Retrieval Activities

First published on February 15, 2020. Last updated on June 11, 2024.

Activity Objective

Students will gain experience locating and retrieving historical documents.

Document Retrieval and Comprehension

The English constitution is thus far unwritten. However, an early document of rights was the Magna Carta. While English King Richard the Lionhearted was gallivanting around continental Europe and crusading in the Middle East, his brother Prince (later King) John (of Robin Hood fame) was oppressing the peasants and, even worse for him, the nobility. The nobility rebelled and in 1215 forced King John to agree to the Magna Carta, which is the closest thing there is to a written English constitution, except it’s not.

A comic image of a tall thin lion in king's robes

An absurd American rendering of Prince John (credit: Disney, 1973)

Nevertheless, the Magna Carta is a fine example for this activity, because it’s the closest thing to an easy medieval document, except it’s not. However, it shares several common traits of old digitized documents.

This being an English document, let’s start this adventure at the British Library. If you are clever, you can grab a copy of a photograph of the Magna Carta from that site. I have done so here:

A page of parchment with writing too small to read

A photograph of the Magna Carta (2015)

What can you determine from this photo? What can’t you determine? Here is a link to another original at UK’s National Archives site. Does that help much? Can you recognize the characters? Can you identify and understand the words?

Fortunately, that site has another photograph of a different version of the Magna Carta. Different version? Yes, although the original Magna Carta was issued in 1215, revised versions were issued in 1216, 1217 and 1225. Happily, the 1225 version is more readable. Look at it for yourself (use the magnifier): 1225 Magna Carta. Can you recognize the characters?

Here are additional links and tools by the Society of Antiquaries of London. You can see a text versus image comparison. Can you identify and understand the words now? Perhaps only if you know Latin. Such is the case with many ancient documents. Scanning and digitizing them is often not enough by itself.

Further, in pure scans, you cannot simply copy and paste the characters into Google Translate, because the characters have not yet been recognized or transcribed into computer characters. You could manually transcribe the document, or create an artificial intelligence application to recognize characters in an old scribe style.

Many famous documents have been translated. You have to trust the translator and know the context of the document, but a translation may suffice to get you started with the document. The Magna Carta is famous and has been translated. Or see a US government archives translation. You should note and critically evaluate the source of the translation.

Try going through this process with another famous document from pre-industrial times.

4 Digging Deeper into Document Repositories

First published on August 24, 2019. Last updated on February 18, 2020.

Activity Objectives

Students will learn methods for digging deeper into more examples and types of digital document and record repositories.

Additional Repositories

There are some obvious digital repositories that show up with a basic web search. However, there are additional types of digital repositories that will be of value that might require deeper searching or directly contacting the source institution.

Some smaller organizations have collections that are digitized, but may not be online. These may be on disks (CDs, DVDs, etc.), magnetic tapes, or other older digital media. Specialized reading devices might not be commonly available, but the institution itself might still have a working device. Punch cards and punched paper tape are much less common these days, but not entirely extinct.

Needless to say, you might have to go to the physical site to access these materials. Fortunately, some libraries provide short-term fellowships if the site is far away or you will require a few months to go through the materials. Or if you know exactly what you need, the institution might make a copy of the desired materials. (First, ask your home library if they can request a copy, because there are often inter-library agreements and consortia that facilitate such requests.)

University Libraries

University libraries are an obvious source, and they can often get you past paywalls for digital journal articles and books. They may also have online research help guides for particular subjects. Often universities have their own collections of digitized primary sources, especially for regional records, events, newspapers and photographs or valuable donated collections. Many items in such collections may be unique. You might need student, faculty or staff status to access those materials, even if you pay for a “friends of” membership.

Public and Community Libraries

Community libraries often have their own sources of local original materials, and their collections often contain some rare original sources. Community libraries can sometimes be a way to get past paywalls or obtain books from university libraries (such as through the Link+ service). The collections and services of community libraries can vary tremendously.

Historical Societies

Historical societies have original local and regional historical sources which might encompass much more than you would expect. These societies have often digitized at least part of their collection. Example: California Historical Society digital collections. Check to see if you have a special status that can get you enhanced access.

Museums

Obviously, museums often have extensive collections of original sources (albeit often in artifact form). What is less known is that some museums also operate as research institutes and may have extensive libraries and subject files. You may have to get special permission to access those materials or even to see their collection listings, so be prepared to make a specific request and to justify it. It may be a challenge to locate such libraries, so keep trying. They often don’t really want the general public to know of their existence.

Journals

Journals typically contain secondary sources, but may also have partial or full reproductions of original sources. Some journals, issues and articles might be behind paywalls, but sometimes those articles are also available through free sources.

Newspapers

Some newspapers keep files of historical information on topics of interest. They may keep images of old issues online or on microfilm. Google News often does not go back very far, so you may have to search directly on a newspaper’s site for stories from long ago. (Sometimes a regular web search works better for finding old news than Google News.)

Comprehensive

There are comprehensive digital archives sources, such as the Internet Archive, “a non-profit library of millions of free books, movies, software, music, websites, and more.”

Resources

BibTeX is a reference management system for users of LaTeX.
EndNote is a general program for storing reference information.
Master’s program, Humanités numériques et computationnelles (a short video with English subtitles)
Master’s program, Digital Humanities (video), University College London.
SQLite database system to create, edit and store tables and records.
Solr is a search engine for your site (requires experienced developers to set up).

4.1 Digging Deeper into Document Repositories Activities

First published on February 15, 2020. Last updated on February 18, 2020.

Activity Objectives

Students will perform queries on actual historical databases.
Students will learn how to gain enhanced access to online repositories of historical documents and information.

Text parsing and processing

Once you find relevant documents, you might need a faster way to search for content of particular interest than reading everything. Text parsing is a way to search for particular terms or fragments. It can get much more sophisticated than simply typing a search term. Parsing involves searching and sometimes changing text in an automated manner. For example, one may wish to search a collection of ancient documents for a particular person’s name, for a certain period, while omitting another person’s name.

Parsing requires the text to be in a computer-readable form. Optical Character Recognition (OCR) software can convert images containing text into searchable text documents. There are many considerations required in parsing. For example, are there different spellings of that person’s name? Is the capitalization of the name inconsistent? Is that person known by nicknames or abbreviations?

There are many tools for parsing. The most common is the simple find, or find & replace, command in word processors, text editors, and many other applications. So you do not necessarily need to write your own program for this. However, you may have to become skilled at writing expressions to find exactly what you want.
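As a rough illustration of such an expression, here is a minimal sketch in Python using its standard re module. The sample lines and both names are invented for illustration. It finds lines that mention one person, regardless of capitalization and with one alternate spelling, while omitting lines that mention another person.

```python
import re

# Hypothetical sample lines standing in for OCR'd documents.
documents = [
    "In 1841 Jean Doe purchased land near the river.",
    "JEAN DOE and Richard Roe signed the deed.",
    "Jeanne Doe was baptized in the parish church.",
    "Richard Roe paid the tax alone.",
]

# Match "Jean" or "Jeanne" followed by "Doe", ignoring capitalization.
wanted = re.compile(r"\bjean(?:ne)?\s+doe\b", re.IGNORECASE)
# A second name that must NOT appear in a matching line.
unwanted = re.compile(r"\brichard\s+roe\b", re.IGNORECASE)

matches = [
    line for line in documents
    if wanted.search(line) and not unwanted.search(line)
]

for line in matches:
    print(line)
```

Restricting the search to a certain period would need an extra test on a date found in each line or document, which is left out here to keep the sketch short.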

4.2 Level Up: Perl and Workflows

First published on . Last updated on June 12, 2024.

Level Up

Students will write a brief Perl program to parse a sample document.

Parsing a block of text means finding one or more characters, and then flagging or changing that group. This is a very important skill in both literature research and professional programming.

It is easy enough to search for a short group of characters (also called a “string”) in a word processing document. Word processors often make it easy. However, sometimes you will need to do a more complicated search or efficiently go through many documents.

Let’s examine an example. Say you were looking for all references to the name Jean Doe in a large collection of digitized letters and public records. Here is a simple way to do it:

Open file
Search for “Jean Doe”, and mark position of each find.
Close document.
Repeat until all documents have been searched.
Export report of all found instances.
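The steps above can be sketched in a few lines of code. The version below is in Python rather than Perl, and it is only one possible approach. The function name find_term, the file names and the sample text are all invented for this illustration, and it assumes the documents are plain text files in a single folder.

```python
import tempfile
from pathlib import Path

def find_term(folder, term):
    """Return (file name, character position) for every occurrence of term."""
    report = []
    for path in sorted(Path(folder).glob("*.txt")):   # repeat for each document
        text = path.read_text(encoding="utf-8")       # open and read the file
        position = text.find(term)
        while position != -1:                         # mark position of each find
            report.append((path.name, position))
            position = text.find(term, position + 1)
    return report

# Demonstration with two small hypothetical documents in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "letter1.txt").write_text(
        "Dear Jean Doe, I remain yours. Jean Doe", encoding="utf-8")
    Path(folder, "record2.txt").write_text(
        "No relevant names here.", encoding="utf-8")
    report = find_term(folder, "Jean Doe")

# Export the report (here, simply print it).
for name, position in report:
    print(name, position)
```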

Easy enough, kind of. Except that names often get misspelled or translated.

Jean could be spelled as “Gene” or translated as Jeanne or John. So you might have to search for those and similar cases as well. Or there might be spaces or hyphenations in the middle of the name, so you might also have to search for “Je an”. Or what if you are looking for Jean Doe only written as a stylized signature? Then you might have to run an image recognition search. You can only do so much, and the importance of what you need to find and your available resources will dictate your level of effort. However, this is certainly not an exact science!
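One way to cope with such variants is to build a single search pattern from a list of spellings. The sketch below does this in Python (a Perl version would use very similar regular expressions). The list of variants and the helper function loose are assumptions made for this example. The pattern will miss spellings you did not anticipate, and it may also produce false matches.

```python
import re

# Assumed variants of the first name; extend this list for your own sources.
first_names = ["Jean", "Jeanne", "Gene", "John"]

def loose(word):
    """Allow an optional space or hyphen between the letters of a word."""
    return r"[-\s]?".join(re.escape(letter) for letter in word)

pattern = re.compile(
    r"\b(?:" + "|".join(loose(name) for name in first_names) + r")"
    r"\s+" + loose("Doe") + r"\b",
    re.IGNORECASE,  # ignore inconsistent capitalization
)

samples = ["Je an Doe", "gene doe", "Jeanne Doe", "Jane Doe", "John Do-e"]
found = [sample for sample in samples if pattern.search(sample)]
print(found)
```

Here “Jane Doe” is not matched, because “Jane” is not in the list of variants.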

Perl is an older computer language, but it is good for searching for patterns in text. Use the short course Perl Programming Language to become familiar with Perl basics.

5 Databases

First published on September 23, 2019. Last updated on June 12, 2024.

Database Concepts

Databases are computerized systems that contain data. Database systems comprise the database application (a program) and data storage.

There are several important database concepts.

A field is a container for an item of information. It is analogous to a variable in a program in that its contents can change.
A record is a collection of fields that together comprise a set of associated data.
A record may contain one or more fields, each called a key, that can be used to identify the record. At least one key should be unique for that record, which is then called the primary key.

In traditional databases, data is organized into tables. A database table has a special structure. Typically a table will comprise rows and columns. A row comprises a record. Each column represents a field. All of the records in a table will have all of the same fields, although the contents of a field may vary among records. Below is a table with three records and four fields.

University Name	City	State	Country
Oregon State University	Bend	Oregon	USA
San Francisco State University	San Francisco	California	USA
Warwick University	Warwick		UK
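The table above can be recreated as a real database table. Here is a minimal sketch using Python’s built-in sqlite3 module with a temporary in-memory database. The table name universities and the extra id field (added to serve as a primary key) are assumptions for this example and do not appear in the table above.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # temporary in-memory database
connection.execute(
    """
    CREATE TABLE universities (
        id INTEGER PRIMARY KEY,   -- unique key identifying each record
        name TEXT,
        city TEXT,
        state TEXT,
        country TEXT
    )
    """
)

# One tuple per record; None marks a field with no contents.
rows = [
    ("Oregon State University", "Bend", "Oregon", "USA"),
    ("San Francisco State University", "San Francisco", "California", "USA"),
    ("Warwick University", "Warwick", None, "UK"),
]
connection.executemany(
    "INSERT INTO universities (name, city, state, country) VALUES (?, ?, ?, ?)",
    rows,
)

records = connection.execute(
    "SELECT id, name, state FROM universities ORDER BY id"
).fetchall()
for record in records:
    print(record)
```

Every record has the same fields, even where a field (such as the state for the third record) is empty.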

Spreadsheets resemble database tables. If used where one row strictly corresponds to one record and if columns are used consistently, then a spreadsheet table can be used as a database and it can sometimes be directly imported into a database system. However, many people do not use spreadsheets so strictly, so that importing them into a database often would cause havoc.

Relationships

It is possible to relate tables with each other. For example, a table of students and a table of course enrollments might be related by student ID #. This provides the ability to pull information from one table into another table to provide a richer set of information. Some database systems might have many related tables involving multiple relationships. For example, in the diagram below, the Recipe Steps table is related to the Products, Process and Resources tables to bring together a lot of different information involved in determining the energy or greenhouse gases involved in manufacturing a product.

Entity-Relationship Diagram
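The simpler students and enrollments example can be sketched as follows, again using Python’s built-in sqlite3 module. The table names, field names and sample data are invented for illustration. The JOIN clause relates the two tables through the shared student_id field.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollments (student_id INTEGER, course TEXT);
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO enrollments VALUES
        (1, 'Medieval History'), (1, 'Latin'), (2, 'Cartography');
    """
)

# Pull the student's name into each enrollment record via the shared ID.
joined = connection.execute(
    """
    SELECT students.name, enrollments.course
    FROM students
    JOIN enrollments ON enrollments.student_id = students.student_id
    ORDER BY students.name, enrollments.course
    """
).fetchall()

for name, course in joined:
    print(name, course)
```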

Reasons to Use Databases

Databases have several capabilities that add value to historical work. Databases are searchable. Unlike a document, it is possible to search one field at a time in a database. Database records can be sorted, usually by any of the fields. So if you have a contacts database, you could sort it by date entered, surname, or state. You can also perform multi-field searches, and sort the results first by one field and then another.
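As a small sketch of a multi-field search with a two-level sort, consider a hypothetical contacts table (the field names and data are assumptions for this example). The query searches on two fields at once, then sorts first by state and then by surname.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE contacts (surname TEXT, state TEXT, date_entered TEXT)"
)
connection.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        ("Nguyen", "California", "2019-03-01"),
        ("Alvarez", "Oregon", "2019-01-15"),
        ("Baker", "California", "2018-11-30"),
        ("Castro", "California", "2019-02-10"),
    ],
)

# Dates stored as YYYY-MM-DD text compare correctly as plain strings.
results = connection.execute(
    """
    SELECT surname, state, date_entered
    FROM contacts
    WHERE state = 'California' AND date_entered >= '2019-01-01'
    ORDER BY state, surname
    """
).fetchall()

for row in results:
    print(row)
```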

It is possible to relate one database table to another table. In that case the database is called a relational database.

Although a comma-separated value (CSV) file is not itself a database, it can contain the data from a database, where each field is separated by a comma and each record by a line return (line feed).
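To make the CSV layout concrete, here is a minimal sketch that reads CSV text with Python’s standard csv module. The text reuses the university table from earlier in this lesson. In practice you would open a file rather than use a string, which is done here only to keep the example self-contained.

```python
import csv
import io

# The first line holds the field names; each later line is one record.
csv_text = (
    "University Name,City,State,Country\n"
    "Oregon State University,Bend,Oregon,USA\n"
    "San Francisco State University,San Francisco,California,USA\n"
    "Warwick University,Warwick,,UK\n"
)

records = list(csv.DictReader(io.StringIO(csv_text)))

for record in records:
    print(record["University Name"], "|", record["Country"])
```

Note the two commas in a row in the last line: they mark an empty State field.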

The ability to search databases is really important. After all, what’s the point of storing data if you can’t find and retrieve the data you require? Not much! There is a special language for searching databases called Structured Query Language, or SQL for short (pronounced “sequel” or S-Q-L).

Database Platforms

EndNote and BibTeX are essentially database applications to manage bibliographies, citations and references.

SQLite is a relatively simple desktop database system. It can create databases, store data in proper tables and execute SQL queries. It is relatively simple to set up, even though its interface can be confusing for novices.

MySQL is a popular open source SQL database. It typically requires a command line, a program or web application to access.

Other professional databases include Oracle, FileMaker and Access. Oracle is the leading “corporate” database platform. It is expensive! Airtable is a cloud-based database platform.

Custom-Developed Databases

Many companies have changed from their own custom databases to specific web applications that offer database features for a set of specific uses such as monitoring customer relations (Siebel, Salesforce), project management, running manufacturing (SAP) and tracking human resources (PeopleSoft). These platforms typically claim they can handle most types of operations for many different types of companies, but usually they are better for some uses and companies than others.

There has also been a movement from companies hosting applications that they purchase to simply renting applications over the web that are hosted by an outside service provider. This is called Software-As-A-Service (SaaS). It can lower costs in the short run and make set-up and maintenance much easier, but it means you are locked into paying if you want to keep using the system (versus a one-time payment for traditional applications) and also the vendor will have possession of your data.

Analysis: Databases Versus Parsing Programs

Queries upon data can be performed using both databases and parsing programs (such as those written in Perl). So why use one over the other? A database structures information better, such as into fields, each with their own data types. So when you do a search on a numerical field, you generally know that you are dealing with numerical data. You can also do field-specific searches.

However, databases often require more upfront work to accomplish that structuring. In addition to setting up the database itself, one must accurately enter the data into the database, field by field. Can’t the data entry process be automated to allow for bulk entry of documents and records and auto-population of specific fields? Yes, if the data has a consistent, preexisting structure. Often business and trade records will be recorded using a regular structure, so that a database script will know to look for a date in the first several characters of each line, a financial amount at the end of each line, and some descriptive information in between. However, a lot of data will not possess such preexisting structure and consistency.
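Here is a minimal sketch of that kind of automated entry, assuming a very regular and entirely hypothetical ledger format: a date at the start of each line, a financial amount at the end, and a description in between. Lines that do not fit the structure are returned as None so they can be reviewed by hand.

```python
import re
from decimal import Decimal

# Hypothetical ledger lines; the last one lacks the expected structure.
ledger = [
    "1887-03-14 Sale of wheat, 40 bushels 32.50",
    "1887-03-15 Purchase of rope 4.25",
    "Weather was poor this week.",
]

# Date first, amount last, description is whatever lies between them.
line_pattern = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.*\S)\s+(\d+\.\d{2})$")

def parse_line(line):
    """Split one ledger line into fields, or return None if it does not fit."""
    match = line_pattern.match(line)
    if match is None:
        return None
    date, description, amount = match.groups()
    return {"date": date, "description": description, "amount": Decimal(amount)}

parsed = [parse_line(line) for line in ledger]
for fields in parsed:
    print(fields)
```

Each resulting dictionary could then be inserted into a database table as one record.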

A parsing program can search an entire document for something that looks like a year or financial amount and look for variations of words (for example, by assuming that any word of four or more letters that ends with “son” or “sen” is a surname). Also, parsing programs often have less “overhead” than databases, so they may be able to perform faster searches on large quantities of data.
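These two heuristics can be sketched with regular expressions, shown here in Python. The sample sentence is invented, and both rules are deliberately crude assumptions: a “year” is any four-digit number starting with 1, and a “surname” is any capitalized word of four or more letters ending in “son” or “sen”.

```python
import re

text = (
    "In 1848 Anders Johansson and Maria Hansen left Bergen; "
    "by 1852 Olsen had joined them."
)

# Anything that looks like a year between 1000 and 1999.
years = re.findall(r"\b1[0-9]{3}\b", text)

# Capitalized words of four or more letters ending in "son" or "sen".
surnames = re.findall(r"\b[A-Z][a-z]*s[oe]n\b", text)

print(years)
print(surnames)
```

Rules like these will produce false matches (for example, ordinary words such as “Lesson” at the start of a sentence), so the results still need human review.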

5.1 Database Activities

First published on February 15, 2020. Last updated on June 12, 2024.

Activity Objectives

Students will learn about databases. They will review spreadsheets as a metaphor for data tables and relational databases.
Students will create a database using a simple tool such as SQLite and learn how to perform queries.

Activity

We will use MySQL to create a database. You can use the Tutorialspoint MySQL online terminal to enter and run the code. Replace the existing text in the editor with the text below. You may need to begin your code with “BEGIN TRANSACTION;”.

1. Create the database. This is essentially a shell structure for everything else. We will call the database historyinfo. Remember to include the semicolon at the end of the line and hit return.

CREATE DATABASE historyinfo;

You should receive a message such as “Query OK, 1 row affected (0.01 sec)”. If this line produces an error, try omitting it and starting with the next line.

2. Select database. Even though it is obvious to you which database you want to use, perhaps you will be working with several different databases in the future.

USE historyinfo;

You should receive a confirmation message similar to: “Database changed”, even though you didn’t really change anything.

3. Finally we will get to something more interesting. Create a table in the database, but at the same time, let’s add some fields (columns). Each line starts with the field name and then states the field type. Each field line ends with a comma, except for the last.

CREATE TABLE historyevents (
EventYear INT,
Event VARCHAR(255),
Description VARCHAR(255)
);

You should get something like “Query OK, 0 rows affected” for a confirmation. Common field types include INT for whole numbers (-5, 0, 33, 2019), DECIMAL, VARCHAR( ) for text (where the number in parentheses is the maximum allowed number of characters), DATE and TIME.

4. Enter some information into the database. We tell the database which table to use (even though we only have one), then you provide a list of fields, then a list of data (both lists should be in the same order).

INSERT INTO historyevents (EventYear, Event, Description)
VALUES (1215, 'Magna Carta', 'Rights document signed by monarch in England');

You should receive something like “Query OK, 1 row affected” for confirmation. Yes, finally you have affected a row (record)!

5. Let’s perform a query to see the result of all of this effort. Retrieve some information from the database.

SELECT * FROM historyevents;

A nice little table should appear in the output.

+-----------+-------------+----------------------------------------------+                                         
| EventYear | Event       | Description                                  |                                         
+-----------+-------------+----------------------------------------------+                                         
|      1215 | Magna Carta | Rights document signed by monarch in England |                                         
+-----------+-------------+----------------------------------------------+                                         
1 row in set (0.00 sec)

Here, “*” tells the query to select all fields. Because the query has no other constraints, it returns every record in the historyevents table, but we only have one record, so that is all it shows. There are many ways to make the query more specific. We can constrain the query to display only certain fields, or only show records with certain values in a field. There are many more options, and trying to figure out how to use them to get exactly the data you require can be challenging.

6. Next, enter two new records at once. You only have to type VALUES once. Remember to separate each record with a comma.

INSERT INTO historyevents (EventYear, Event, Description)
VALUES (960, 'Song Dynasty', 'Start of Song Dynasty in China'),
(1066, 'Norman Invasion', 'Norman conquest of England');

Now let’s try a more specific query:

SELECT EventYear, Event
FROM historyevents
WHERE EventYear < 1100;

You should see results similar to:

+-----------+-----------------+                                                                                     
| EventYear | Event           |                                                                                     
+-----------+-----------------+                                                                                     
|       960 | Song Dynasty    |                                                                                     
|      1066 | Norman Invasion |                                                                                     
+-----------+-----------------+                                                                                     
2 rows in set (0.00 sec)

Here are some useful commands to get information about your database:

SELECT DATABASE();
SHOW TABLES;
DESCRIBE tablename;
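The commands above are specific to MySQL. If you work with SQLite instead, as suggested in the activity objectives, rough equivalents are a query on the built-in sqlite_master table and the PRAGMA table_info command. Here is a minimal sketch using Python’s built-in sqlite3 module with a temporary in-memory database:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE historyevents ("
    "EventYear INT, Event VARCHAR(255), Description VARCHAR(255))"
)

# Rough equivalent of SHOW TABLES.
tables = [
    row[0]
    for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# Rough equivalent of DESCRIBE historyevents.
# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2])
    for row in connection.execute("PRAGMA table_info(historyevents)")
]

print(tables)
print(columns)
```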

6 Using Image Processing to Gain Superhero Vision

First published on August 24, 2019. Last updated on June 12, 2024.

Learning Objectives

Students will learn visual concepts and what can be achieved through image processing, such as bringing out hard-to-see text and details of documents and photographs.

Introduction

Image processing involves analyzing and altering images, typically photographs. Images can involve billions of pixels, so having automated techniques to process and analyze images is invaluable. Images can be analyzed to search for certain features, such as structures, or materials. Several ancient complexes have been discovered by image analysis. It is possible to search some images for certain types of minerals or plants.

Challenges

There can be several challenges to image processing and analysis.

Images might be blurry or out of focus.
Part of the image might be missing or faded.
The image might have become color distorted.
The image might not have the required color information.
Photos may have been taken at angles that make the subject difficult to view.
You don’t really know what you are looking for, or even if you do, what it looks like.

Suitable Image File Types

There are several common image file types:

JPEGs (or JPGs) are well-suited for photographs and offer the benefit of compression. This allows for reduced file size and faster loading. JPEGs also offer a wide range of colors.
GIFs are well-suited for line drawings with basic colors, as well as simple animations (known as animated GIFs), although the PNG format also handles line drawings well. GIFs are limited to a palette of 256 colors, so they are not well-suited to display a rich range of colors.
The PNG format is well-suited for many types of graphics and is the default when creating certain graphics on macOS, such as screenshots. Most newer versions of MS Windows can view PNGs, although some older versions cannot.
SVG is a text-based format in which code describes how to draw an image.

Processing with Python

Several image processing libraries have been written for the Python programming language. Using functions from those libraries, it is easy to examine single colors from an image, or even subtract one color from another, in order to highlight various features, such as vegetation or buildings. Below are examples of a raw image and images processed in Python.

Photo taken in RGB (credit: Google Earth)

Red component of photo taken in RGB (credit: Google Earth)

Red minus blue processed photo taken in RGB (credit: Google Earth)
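The two operations behind the processed images (keeping a single color and subtracting one color from another) can be sketched in plain Python. This is not the code that produced the images above. The tiny 2x2 “image” and its pixel values are invented, and plain lists are used so that the sketch runs without installing anything. In practice, a library such as Pillow would load the pixel data from an image file and offers similar operations, such as Image.split() and ImageChops.subtract().

```python
# A tiny hypothetical 2x2 image: each pixel is a (red, green, blue) tuple, 0-255.
image = [
    [(200, 60, 40), (30, 120, 35)],     # reddish soil, green vegetation
    [(180, 180, 185), (20, 40, 160)],   # gray concrete, blue water
]

def red_channel(pixels):
    """Keep only the red component of every pixel."""
    return [[r for (r, g, b) in row] for row in pixels]

def red_minus_blue(pixels):
    """Subtract blue from red, clipping negative results to 0."""
    return [[max(r - b, 0) for (r, g, b) in row] for row in pixels]

red = red_channel(image)
difference = red_minus_blue(image)

print(red)
print(difference)
```

In the red minus blue result, only the reddish pixel keeps a high value, which is how such a subtraction makes certain features stand out.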

Spectral Analysis

Hyperspectral analysis involves taking images in many different wavelengths instead of the simple red green blue (RGB) of most cameras. The wavelengths for each pixel are plotted to produce a spectral signature for each point. Those signatures are then compared to libraries of material signatures to identify specific substances, plants, or organisms.
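The comparison step can be sketched as follows. The three signatures are made-up reflectance values at just four wavelengths (real libraries use many more), and the comparison method used here, the angle between two spectra, is only one common choice and is an assumption for this example.

```python
import math

# Hypothetical library of material signatures (reflectance at 4 wavelengths).
library = {
    "vegetation": [0.05, 0.08, 0.04, 0.50],
    "water": [0.08, 0.06, 0.03, 0.01],
    "bare soil": [0.15, 0.20, 0.25, 0.30],
}

def spectral_angle(a, b):
    """Angle in radians between two spectra; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard rounding
    return math.acos(cosine)

def identify(pixel_spectrum):
    """Return the name of the library signature closest to the pixel."""
    return min(
        library,
        key=lambda name: spectral_angle(pixel_spectrum, library[name]),
    )

pixel = [0.06, 0.09, 0.05, 0.45]   # spectrum measured for one pixel
print(identify(pixel))
```

Because the angle ignores overall brightness, a pixel in shadow can still match the same material as a brightly lit one.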

Special images and software are required. TNTmips is a free software program that includes a hyperspectral analysis component as well as a dedicated tutorial.

6.1 Image Processing Activities

First published on February 15, 2020. Last updated on June 12, 2024.

Activities

Students will use simple desktop image tools (such as Preview) to quickly examine images and find hidden type.
Students will learn basics of the Python language.
Students will learn how to use simple commands in a Python image analysis library package.

View the Course.Cafe Python Programming course.

6.2 Image Processing Leveling Up

First published on . Last updated on June 13, 2024.

Leveling Up

Learn some Python and use a Python image analysis library package to manipulate images.

Processing Images Using Python

You can process images by using the Python programming language. The Python Imaging Library (PIL) makes some analysis relatively easy. UPDATE: this workshop will be moving to the Pillow package, since PIL is no longer supported.

Make sure that you have a text editor on your computer. It should be able to edit and save pure text. Applications such as MS Word are not very good at this. Applications such as Xcode or Textmate are suitable.
Make sure you have a terminal application on your computer. It might be called Terminal or HyperTerminal. If not, install a terminal program for your computer.
Make sure that you have Python 1.5.2 or newer installed on your computer. To check, open your terminal application and enter python --version
If you need to get or upgrade Python, click here.
Download, uncompress and move the file to your working directory: PIL_test.py
Run that file in your terminal with the command python PIL_test.py
Download and unzip this file: Image_Processing_Example_01
Run the file and practice changing the parameters. (See comments in file.)

Troubleshooting:

If the file says that PIL or Image are missing, download the library from here. Note that PIL has been forked into Pillow, and that installation can be challenging. Use a web search to find additional sources of help for your operating system.

7 Three Dimensional artifact modeling, preservation and reproduction

First published on August 24, 2019. Last updated on February 6, 2021.

Objectives

Students will learn how 3D scanning and printing can act as a means of historical artifact preservation and reproduction.

Introduction

In ancient times, the only way to reproduce a document was to rewrite it by hand. Then the printing press was invented, first in Asia then in Europe, which allowed a document to be reproduced many times. However, traditional printing requires plates to be made with the document content. Those plates might be carved in stone or wood, etched in metal, or made of individual characters of type, but in all cases, significant effort was required. For one or a few copies, reproduction by writing was often the fastest and easiest, even for entire books. In modern times, the photocopier was invented, which allowed easy and routine reproduction of single or a few copies of documents.

However, until recently, reproducing single copies of three-dimensional objects (which we will also call artifacts) remained a largely manual affair, even where tools were used. Meanwhile, 2-D printers began to incorporate digital technology. So instead of operating at the character level, printers were able to operate at much finer resolutions, literally at the dot or “pixel” level. Modern printers can print at resolutions of hundreds of pixels per inch, suitable for many types of characters and even photographs.

If a digital printer can print photographs in two dimensions, why can’t it print in 3 dimensions? In theory, it can, but in practice, a different sort of machine is required.

The Technology

Scanning

Special software can take scans or photos of physical objects and stitch the images together into a three-dimensional model. Often this approach produces sufficient quality, but sometimes it can produce irregularities. For better precision, use a three-dimensional scanner, or place the object on a turntable to ensure more consistent positioning.

3D scanning used for research and preservation at the Smithsonian Institution:

CAD Software

For cases where the object is hypothetical or is not physically available, an object can be designed in drawing software called Computer-Aided Design (CAD). Autodesk offers a popular CAD suite. It is expensive and runs better on expensive hardware. However, Autodesk offers a suite of free applications as well.

Tinkercad is a free, cloud-based CAD program. It is not very precise, but it is sufficient to design simpler objects.

Printing

3D printing can create a physical object either from a digital design file or from scans and photographs of pre-existing physical objects. 3D printers print objects one “slice” or layer at a time, usually beginning at the bottom and building upwards. Most common 3D printers print in plastic, which can be easily melted and channeled, but other materials can be used such as metals, resins, gypsum and even chocolate (although not on the same printer!).

Printer and Material Types

3D objects can be printed in a variety of different materials. The print material does not have to be the same as the original artifact material. For example, an ancient silver coin can be printed in bright orange plastic. Nevertheless, similarity in look and feel is often desired.

The most popular print material is plastic, which can be melted at a relatively low temperature and is usually non-toxic. There are not many older historical artifacts whose original material was plastic. However, plastics come in many different colors, so it is possible to print objects that are similar in appearance (or at least color) to historical objects. Plastics printers can cost as little as a few hundred US dollars, but are more typically several hundred dollars. Common plastics include:

ABS is the most common print material. It is a tough material and relatively strong.
PLA (polylactic acid) is biodegradable and is made of renewable materials such as corn starch.

Gypsum printers use powdered white gypsum that can be dyed to make complex blends and gradients of colors. The end product resembles lightweight rock, so it is good for printing models of bones, skulls and pottery. The output is somewhat grainy in texture. Gypsum printers cost tens of thousands of US dollars, so they are less common than plastics printers.

Resin printers can produce high resolution and precision prints that have much smoother surfaces. They are good for high fidelity prints and can even be used to produce medical test chips. Resin printers can cost around a hundred thousand US dollars.

Metal printers use a process called sintering to produce metal prints. These printers operate at relatively high temperatures and can be dangerous. They are only found in advanced printing facilities or specialty workshops.

Counterfeiting and Other Legal Considerations

There are several legal considerations involved when making copies of objects (or even of two-dimensional items). It should be obvious that it is unethical and often illegal to pass off copies as the original object. When this involves modern currency and coinage, this is obvious: counterfeiting currency is a well-known crime. Counterfeiting modern coinage is illegal but less common, since the coins are often more expensive to make than their face value. Jurisdiction matters. For example, it is illegal to make fake U.S. currency in the USA but apparently not in North Korea.

Making a copy that is different but resembles an existing object can involve further legal restrictions. If you distort an artistic copy without the artist’s permission, you may be violating a specific legal right called a Moral Right which is enforced in California and parts of Europe but not in most of the USA.

Ironically, in some jurisdictions, possession of a copy might be legal where possession of the original might be illegal. For example, some European and Middle Eastern countries prohibit the possession of ancient coins, while they allow possession of reproductions.

Other items might be patented or copyrighted (documents, paintings, sculptures, household objects). Patents generally don’t last more than a few decades, but copyrights can last over 100 years. Some business, organization and government logos might have perpetual protection. Some copying can be performed in the USA under the “Fair Use” doctrine, but not all countries have this doctrine. Some governments have other types of laws that might cover certain areas, such as religious or cultural objects.

Resources

Tinkercad free online CAD software

7.1 3D Activities

First published on February 15, 2020. Last updated on February 15, 2020.

Activities

Students will use a university or local maker space to 3D scan a sample historical object.
Students will model a historical object using a simple, online CAD tool such as Tinkercad.
Students will use their university’s or a local maker space to print a 3D historical object (either from the scan or their model).

8 Obtaining, Processing and Analyzing Satellite Imagery for Historical Uses

First published on August 24, 2019. Last updated on June 13, 2024.

Learning Objectives

Students will learn how both current and semi-historical (past 70 years) satellite imagery can be used for historical purposes, such as locating and examining historical remains of past societies as well as current/recent regional terrains.
Students will learn the capabilities and limitations of satellite imagery, how to obtain imagery, how to process it and analyze it.
Identify sources of satellite images
Download and use several software packages to open, examine and process different types of images and remote sensing information.

Prerequisites

Knowledge of computers and patience.
Preferred: knowledge of how to use a command line utility, especially for Linux/Unix.

Introduction

Obtaining, processing and analyzing satellite imagery can be a valuable tool set for historians. It allows you to literally be an armchair archeologist. It also gives you some idea of the geographical features of a historical region (with the caveat that these sometimes change over time).

This workshop introduces several tools for examining, processing and analyzing satellite images. Having your own laptop will be beneficial, since you can download and install the applications and files during the workshop and help each other. Be warned that some of these files and applications are large. Having at least a few gigabytes of free disk space is a necessity.

Example

Blue Marble Images of Silk Road

Historical Analysis With Remote Sensing

Introduction

Pyramids, palaces, great walls, entire ancient cities, and even Milwaukee’s Mitchell Park botanical domes…some people prefer their objects of material culture to be colossal! Numerous macro-material objects of historical interest can be observed utilizing satellite imagery and remote sensing. Many large geographic features, such as river areas and mountains have not changed greatly over time. Much in history can be illustrated and analyzed using satellite images, especially if one delves deeper than a mere visual sweep.

Fortunately, there has recently been a copious avalanche of freely available satellite imagery from the National Aeronautics and Space Administration, European Space Agency, Google and other parties. There is also an abundance of free tools and techniques that can greatly enhance the ability to employ such imagery to characterize large historical objects. There is even additional remote sensing data that allows historians to peer beneath the Earth’s surface down into the secrets of covered ruins. The digital historian merely peers: no digging is required!

Credit: Google Earth

Sources

Sources of free satellite imagery from government sources (Landsat, Sentinel, EOSDIS) and private sources (Google, Planet).
Approaches to obtain proprietary imagery for free
NASA Earthdata
Earthdata Human Dimensions
Human Dimensions data
Earth Data visualization tool
Moderate Resolution Imaging Spectroradiometer (MODIS) site

Tools

Easy-to-use, high-level tools to use and manipulate satellite imagery:

Google Earth
Stanford Orbis
Others

Image Analysis

Advanced image analysis and manipulation tools will be briefly introduced, such as Python image libraries, SNAP, HDF and TNTmips.

Activities

Students will start by using simple tools such as Google Earth to locate and examine areas of historical interest.
Students will obtain government-provided satellite images.

8.1 Satellite Imagery Activities

First published on February 15, 2020. Last updated on June 13, 2024.

Google Earth

Google Earth provides an easy way to get free satellite imagery, although what is freely available is quite limited in terms of time frequency and resolution.

Go to the Google Earth website and download Google Earth.
Open Google Earth.
Click on the button to the right of the magnifying glass to see panels.
Choose the Layers panel to see available satellite images and sources.

Window showing Earth with control icons at top of screen.

Credit: Google Earth

Satellite image of Mountain View area with satellite image choices shown as overlay.

Credit: Google Earth

ESA Sentinel

There are five ESA Sentinel missions, each of which concentrates on different capabilities.

1. First sign up for a Copernicus account using Open Hub.

Copernicus main page

Let’s look at visible images (some infrared bands might be available)
Install the Sentinel 2 – Toolbox
Obtain an image through Copernicus.
Load the image into the toolbox.

Image panel, product explorer panel with information and band choices, and Navigator panel shown.

Sentinel Toolbox 2

Plot of frequency of correlation between two bands.

Sentinel Toolbox 2 scatter plot example

HDF

HDF stands for Hierarchical Data Format. It is used for some remote sensing files.
Install HDFView for hierarchical data files (.h4, .h5, etc.)
Download this file: BUV-Nimbus04_L3zm_v01-00-2012m0203t144121.h5.txt
Open up this file in HDFview.

Hyperspectral Images and Analysis

Hyperspectral allows you to determine the materials viewed with much greater precision than most other methods. The amount of hyperspectral imagery available (especially for free) is quite limited, but the quantities should greatly increase in the next few years due to new missions by Satellogic and Planetary Resources.

The best way to learn how to use hyperspectral images is to download the TNTmips GIS software and go through the hyperspectral tutorial on that site.

Datum (formerly TNTmips) is GIS software that has a hyperspectral analysis component. There is a free version.

8.2 Level Up: Satellite Imagery

First published on . Last updated on February 20, 2021.

Leveling Up

Students will run Python scripts to process their images.

Processing Images Using Python

Make sure that you have a text editor on your computer. It should be able to edit and save pure text. Applications such as MS Word are not very good at this. Applications such as Xcode or Textmate are suitable.
Make sure you have a terminal application on your computer. It might be called Terminal or HyperTerminal. If not, install a terminal program for your computer.
Make sure that you have Python 1.5.2 or newer installed on your computer. To check, open your terminal application and enter python --version
If you need to get or upgrade Python, click here.
Download, uncompress and move the file to your working directory: PIL_test.py
Run that file in your terminal with the command python PIL_test.py
If the file says that PIL or Image are missing, download the library from here. (PIL installation can be challenging. Use Google to find additional sources of help for your operating system).
PIL documentation can be found here.
Download and unzip this file: Image_Processing_Example_01
Run the file and practice changing the parameters. (See comments in file.)
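
The downloadable scripts are not reproduced here. As a rough idea of what processing an image means at the pixel level, the following is a minimal sketch in plain Python that needs no PIL. It treats a grayscale image as a grid of brightness values from 0 to 255 and applies a linear contrast stretch. The function name stretch_contrast and the sample grid are assumptions made for illustration; they are not the contents of PIL_test.py or Image_Processing_Example_01.

```python
def stretch_contrast(pixels):
    """Linearly rescale brightness values so they span the full 0-255 range."""
    flat = [value for row in pixels for value in row]
    low, high = min(flat), max(flat)
    if high == low:
        # A completely uniform image has no contrast to stretch.
        return [row[:] for row in pixels]
    scale = 255 / (high - low)
    return [[round((value - low) * scale) for value in row] for row in pixels]


# Hypothetical 3x3 grayscale patch with low contrast (values bunched together).
sample = [
    [100, 110, 120],
    [105, 115, 125],
    [110, 120, 130],
]

stretched = stretch_contrast(sample)
for row in stretched:
    print(row)
```

Faint features in a satellite image, such as soil marks over buried walls, are sometimes easier to see after this kind of stretch. An image library would apply the same idea to every pixel of a real file.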

9 Geographical Information Systems (GIS)

First published on August 24, 2019. Last updated on February 20, 2021.

Objectives

Students will learn about geographical information systems as a tool to bring together various information sources to provide a better spatial understanding and visualization of historical phenomena.
Finally, Geographical Information System (GIS) platforms will be explored. Open-source QGIS will be demonstrated.
Cloud-based software, such as QGIS cloud, and cloud-based Map Editor will also be demonstrated.

Geographical Information Systems (GIS)

Geographical Information Systems (GIS) are ways to store, view and analyze information concerning areas of the Earth’s surface, such as a neighborhood, region, country or continent. They can also be used for oceans and even other planets.

GIS can be used to create maps, territorial diagrams and visualizations and even perform analysis. A software platform must be used for GIS. There are proprietary GIS platforms with lots of features such as ArcGIS. They cost money, but your university might have a subscription that you can use. You can sometimes get a free trial copy. There are also “open source” or lite versions that you can get for free, such as TNTmips or QGIS. It is strongly recommended to go through the tutorial for at least one platform. Most platforms are somewhat similar, so if you know one platform reasonably well, then the others will make much more sense.

Below is an introduction to GIS (credit ESRI):

GIS breaks spatial information down into layers. For example, a base layer might comprise a satellite image. Another layer might superimpose roads and buildings. Another layer might contain labels and annotation. You might import a layer, such as the satellite image, then manually create other layers such as the annotation. There are many types of layers that you can import, sometimes from the platform vendor, but often from other sources such as government sites.

Remember that the Earth is not flat! So unless you are working with a small area on the Earth’s surface, you will need to take into account the curvature of the layers you import or add. The larger the area, the greater the curvature becomes. Most GIS platforms will let you choose a coordinate system, but then you must make certain that the other layers will work with it.

Below is an example showing a map with the Hawaiian Island chain in the Pacific Ocean. To the left is a panel showing the available layers. You can choose which layers to display by checking them. Here, the counties (representing islands), cities and urban area layers have been selected. To the right, you can see those layers displayed. At the lower right hand corner, you can see the selected coordinate system.

GIS view of Hawaii with cities and urban areas emphasized (QGIS)

Tech Talk

There is one technical detail that is particularly important to know. There are two different kinds of layers. You need to keep this difference in mind, because you will sometimes need to handle these types of layers in different ways.

A raster layer is essentially an image comprised of pixels. It may contain a satellite photograph, a scanned hand-drawn sketch or an artist’s rendition. Each point in a raster layer contains data (and often multiple items of data). Raster layers typically must be created and edited outside of a GIS. They also tend to be data intensive. A photograph of the Earth’s surface, such as from an airplane or satellite, is typically a raster image.

Satellite photo of Asia, northeast Africa and Eastern Europe

A raster image: satellite image of Asia (credit Google)

A vector layer comprises lines and points. For example, national boundaries can be shown as a series of small line segments, regardless of shape. A vector object can also include a shape that is comprised of line segments that totally enclose a specific area. These shapes can be filled with a color or pattern and still be vector objects. Locations of objects, such as post offices, can be shown by points and icons. Vector layers can typically be edited with relative ease and usually are not very data intensive. Often, one or more vector layers are superimposed upon a raster layer.

Satellite photo of Asia, northeast Africa and Eastern Europe with shaded areas and connection lines

A vector layer superimposed on a raster layer: satellite image of Asia showing political entities and possible connections (raster layer: credit Google)
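
The difference between the two kinds of layers can be made concrete with a small sketch. The following Python is an illustration only, not how any particular GIS platform stores its data. The layer contents, the feature names and the helper functions are all invented for the example.

```python
# A raster layer: a grid of cells, each holding a value (here, brightness 0-255).
raster_layer = [
    [12, 40, 200],
    [15, 180, 210],
    [90, 95, 100],
]

# A vector layer: named features described by coordinates rather than pixels.
vector_layer = {
    "points": [{"name": "post office", "xy": (1, 1)}],
    "lines": [{"name": "road", "path": [(0, 0), (1, 1), (2, 1)]}],
    "polygons": [{"name": "district", "ring": [(0, 0), (2, 0), (2, 2), (0, 2)]}],
}


def raster_cell_count(raster):
    """Every cell of a raster stores data, so size grows with resolution."""
    return sum(len(row) for row in raster)


def polygon_area(ring):
    """Shoelace formula: area enclosed by a closed ring of (x, y) vertices."""
    total = 0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


print("raster cells:", raster_cell_count(raster_layer))
print("district area:", polygon_area(vector_layer["polygons"][0]["ring"]))
```

Doubling the resolution of the raster quadruples the number of stored cells, while the vector district is still described by the same four vertices. Moving a boundary in the vector layer means changing a coordinate or two, which is why vector layers tend to be easy to edit.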

Another technical detail is that when your GIS territory is either large or is in polar areas, you may need to adopt a special coordinate system. This is because the curvature of the Earth distorts the Earth’s surface. For a small area at a low latitude, ordinary Cartesian coordinates (i.e. flat) may suffice. Otherwise, specialized coordinates will be required to correct for the distortion of the curvature. For many GIS platforms, if you examine the properties of a layer (sometimes via right clicking the layer), you can often see the applied coordinate system.
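
To get a feel for when curvature starts to matter, the sketch below compares a flat-map distance with a distance measured along a sphere (the haversine formula). It is a simplified illustration: the Earth is treated as a perfect sphere with a mean radius of 6371 km, the function names are invented, and the coordinates are rough values chosen for the example. GIS platforms use more refined coordinate systems than this.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def flat_distance_km(lat1, lon1, lat2, lon2):
    """Treat degrees as a flat grid, scaling longitude by the mean latitude."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(dx, dy)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine formula: distance along the curved surface of a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Two point pairs given as (latitude, longitude) in degrees.
small = ((1.0, 30.0), (1.1, 30.1))      # a small area near the equator
large = ((41.9, 12.5), (39.9, 116.4))   # roughly Rome and Beijing

for label, (a, b) in (("small area", small), ("large area", large)):
    flat = flat_distance_km(*a, *b)
    curved = great_circle_km(*a, *b)
    print(f"{label}: flat {flat:.1f} km, curved {curved:.1f} km")
```

For the small area the two figures agree closely, while across a continent they differ by hundreds of kilometers.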

Resources

ArcGIS Story Maps.
Neatline map and timeline tool for Omeka. This tool can provide some of the capabilities of GIS platforms.

9.1 GIS Activities

First published on February 15, 2020. Last updated on June 13, 2024.

Tryouts

Participants will need access to a computer either with GIS software already installed or the ability to install software. The computer should not be too slow and a good internet connection is recommended.

GIS Software and Information

QGIS and TNT MIPS are free GIS software. ArcGIS is expensive, but many schools have it available for free use for students.

QGIS open source GIS software. This is a relatively easy and free GIS software: downloads
Datum (formerly TNT MIPS) free GIS software with hyperspectral analysis capabilities: downloads
ArcGIS is proprietary, but sophisticated industry standard GIS software that many universities make available to students.

Activity

Students will get as far as they can through a standard GIS tutorial in the time available. Be warned that GIS software is not really easy to use and may run slowly on older or inexpensive hardware.

Suggested tasks:

Create an empty project in the GIS software and save it.
Create a raster layer. (Either use maps or images provided by software, or import a satellite image from Google Maps or a similar resource).
Create one or more vector layers to annotate the raster layer.
Save your project again and export the project as a single image.

10 Computational History and Simulations

First published on August 24, 2019. Last updated on February 20, 2021.

Objectives

Students will learn about the various types of historical simulations and how they can develop them.

Context

There are two schools of thought regarding the nature of history as a discipline. Some consider history to be among the humanities. Others consider it to be a social science. Of course, the two are not mutually exclusive. This work does not propose a quantitative approach as a replacement for narrative, scholarly approaches, but rather to provide useful additional tools.

Entering into quantitative methods raises several questions, such as “What can be quantified?”, “How accurate and meaningful is historical quantitative data?”, “What can be modeled?” and “How accurate are such models?”

Nearly anything can be quantitatively modeled, either directly or by proxy. Even love can be quantified, through proxies, such as in terms of hours a day thinking about someone or the cost of an engagement ring versus income or assets. Or perhaps even directly, via electrodes wired to the brain.

A more penetrating matter regards the accuracy of such models. Proxies can always be found, and some data can always be found, yet it may not be abundant or precise enough to produce models of sufficient accuracy for the purposes desired. This question must be answered on a case-by-case basis, although it can be possible to make generalizations about accuracy. For example, the further one goes back into history, the less abundant and accurate the data generally is. Also, war casualties, especially further back in history, are often suspect, and tend to be exaggerated either up or down, depending on the perspective of the source.

Finally, is the cost worth it? Much data can be obtained, but it can often be costly to gather and process it. Can one afford that and is it worth the cost for the benefit obtained?

Types of Quantities

Many different things can be quantified, although some are more obvious than others. Battles are often quantified by the numbers of soldiers fighting on each side, as well as by casualties and reparations. There is often commercial and trade data, such as how much wheat was produced in a kingdom, or taxes on trade. There is geographic data, such as how many square kilometers a kingdom ruled, how much rainfall that area received, and how long were trade routes. Finally, there is time data, such as how long certain historical persons lived, or how long their dynasties endured.

Types of Models

Examples of simple models that can be easily visualized are introduced: linear, quadratic, exponential growth, logistic, and efficiency-discounted exponential growth (EDEG).
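
As a rough illustration, the first four model families named above can each be written as a short function of time. The sketch below uses standard textbook forms. The parameter names (start, rate, accel, growth, capacity) and the sample values are assumptions for the example, and EDEG is left out because this section does not give its formula.

```python
import math


def linear(t, start, rate):
    """Constant change per unit of time."""
    return start + rate * t


def quadratic(t, start, rate, accel):
    """Change that itself speeds up (or slows down) steadily."""
    return start + rate * t + accel * t ** 2


def exponential(t, start, growth):
    """Growth proportional to the current size."""
    return start * math.exp(growth * t)


def logistic(t, start, growth, capacity):
    """Growth that begins exponentially but levels off at `capacity`."""
    return capacity / (1 + ((capacity - start) / start) * math.exp(-growth * t))


# Illustrative values only: compare the four curves at a few points in time.
for t in (0, 25, 50, 100):
    print(
        t,
        round(linear(t, 10, 2)),
        round(quadratic(t, 10, 2, 0.05)),
        round(exponential(t, 10, 0.05)),
        round(logistic(t, 10, 0.1, 1000)),
    )
```

Printing or plotting the columns side by side shows the characteristic shapes: a straight line, a parabola, a curve that keeps accelerating, and an S-shaped curve that flattens out.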

Computational History

Computational history is related to quantitative methods. Computational history is a subset of the digital humanities. Computational tools can be extremely useful for simulating, illustrating and visualizing certain aspects of history. That said, there should be a lot of thinking before computational techniques are brought into play. One can generate considerable data and even impressive graphics that don’t really mean anything, or are just plain incorrect or misleading. Remember the lessons of Merlin’s Apprentice!

Tools

Simulation tools are briefly introduced, such as pen-on-paper, MS Excel, Ruby, Python, R, Wolfram Alpha, Processing, SVG, and geographic information systems (GIS).

Creating And Using Models

Medieval facade. Main entrance flanked by two bell towers. Stonework.

Front exterior of Notre Dame cathedral, Paris, France

A model is a hypothesis about how something exists or works. A model could be a small version of something large, such as a table top copy of the Notre Dame cathedral in Paris. Such a model would represent the large, most important features of the cathedral such as the towers and flying buttresses, and possibly representations of some of the more distinctive smaller features such as the stained glass windows.

A model can also be one or a set of mathematical equations that relate one quantity to another. For example, an equation could relate dynasty power to time. Such a model could be refined, such as to represent central versus regional power. We will primarily be concerned with creating quantitative models.

Creating a quantitative model is really easy. Just relate two quantities to one another. For example, write the following equation:

\(quantity~of~Roman~empire~soldiers = year~in~CE\).

According to this model, the number of soldiers in the Roman empire is equal to the year in current era years. So in 100 CE (AD), the number of Roman imperial soldiers would be 100. This certainly is a model, because it produces results that can be compared with actual data. Historians evaluate the validity of such data, which may come from literary or archeological sources, and then can compare it with the model. A range of uncertainty is estimated. If the model produces a result that is not within the range of uncertainty for the data, the model is rejected or revised. If the model fits within the range, then it is valid, although not necessarily absolutely correct (no model ever gets proved) or representative of ultimate truth. Generally, models that fit the data the best and are consistent with other valid models tend to be more accepted.

It often requires several attempts to get a valid model, and many attempts to obtain better ones. The above example concerning Roman soldiers can be quickly rejected using commonly available data.

Models can be improved by including additional terms and changing parameters. For example, adding a baseline number of soldiers, and then a term that might take into account the growth of mercenaries might improve the accuracy of the model.
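
The test-and-revise cycle described above can be sketched in a few lines of Python. The estimate range below is a hypothetical placeholder rather than a sourced figure, and the baseline and growth values in the revised model are likewise invented purely to show the mechanics.

```python
def simple_model(year_ce):
    """The deliberately naive model from the text: soldiers = year in CE."""
    return year_ce


def revised_model(year_ce, baseline, growth_per_year):
    """A revision: a baseline force plus a term that grows over time."""
    return baseline + growth_per_year * year_ce


def fits(predicted, low, high):
    """A model passes if its prediction falls inside the uncertainty range."""
    return low <= predicted <= high


# HYPOTHETICAL uncertainty range for the year 100 CE (placeholder values).
ESTIMATE_LOW, ESTIMATE_HIGH = 250_000, 450_000

naive = simple_model(100)
revised = revised_model(100, baseline=300_000, growth_per_year=500)

print("naive model:", naive, "fits?", fits(naive, ESTIMATE_LOW, ESTIMATE_HIGH))
print("revised model:", revised, "fits?", fits(revised, ESTIMATE_LOW, ESTIMATE_HIGH))
```

A model that passes for one year is only provisionally valid. It would still need to be checked against estimates for other years before being preferred over alternatives.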

In history, often the available or accepted data is limited, and uncertainties may be high. So initially, a more pragmatic approach may be to propose models and explore to what extent they might be valid.

Resources

CoMSES collection of resources for computational model-based science
West Big Data Hub
Harvard Dataverse
Journal of World-Systems Research (possibly a place to publish simulation analysis)
Collaborative for Historical Information and Analysis (CHIA)

10.1 Computational History and Simulations Activities

First published on February 15, 2020. Last updated on February 20, 2021.

Activities

1) Students will “storyboard” their concept for a simulation.

2) Students will use a digital tool to create a historical simulation.

Leveling Up

You can write a computer program to create a quantitative simulation. Ruby is a good language for this because it is easy to use and mathematically robust.
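
The text suggests Ruby. The same kind of program can be sketched in Python, which this course uses elsewhere. The following minimal example steps a population forward one year at a time, with growth slowing as the population approaches what the land can support. The function name and every parameter value are assumptions for illustration, not historical data.

```python
def simulate(years, population, growth_rate, capacity):
    """Step a population forward one year at a time (discrete logistic growth)."""
    history = [population]
    for _ in range(years):
        # Growth slows as the population approaches the land's capacity.
        population += growth_rate * population * (1 - population / capacity)
        history.append(population)
    return history


# Invented starting conditions for a hypothetical settlement.
history = simulate(years=200, population=1_000, growth_rate=0.05, capacity=50_000)

for year in range(0, 201, 50):
    print(year, round(history[year]))
```

Changing one parameter at a time and rerunning is a quick way to explore how sensitive the outcome is to each assumption.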

11 Spatial and Network Simulations

First published on August 24, 2019. Last updated on June 14, 2024.

Objectives

Students will learn about spatial and network simulations such as cellular automata, networks and geographic relationships and the tools to develop such simulations.
Students become familiar with the applications of such simulation techniques to spatial history.

Introduction

Spatial history concerns the presentation and analysis of historical phenomena over areas of territory. Several tools can be used. Geographic Information Systems (GIS) are a popular tool, but require their own discussion.

Network approaches represent a society as a collection of interconnected nodes and study interactions via those connections. The nodes can be represented as objects (such as points, circles or otherwise) and the connections can be represented as lines or pairs of nodes.
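
As a minimal sketch of this representation, the Python below stores a network as a list of node pairs and then derives each node's neighbours. The place names are invented, and build_network is a hypothetical helper rather than part of any library.

```python
from collections import defaultdict

# Hypothetical towns (nodes) and the routes between them (connections).
connections = [
    ("Port", "Market"),
    ("Market", "Fort"),
    ("Market", "Abbey"),
    ("Fort", "Abbey"),
]


def build_network(pairs):
    """Turn a list of node pairs into a lookup of each node's neighbours."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)  # connections here are two-way
    return neighbours


network = build_network(connections)

# Degree: how many direct connections each node has.
degree = {node: len(linked) for node, linked in network.items()}
most_connected = max(degree, key=degree.get)

print(degree)
print("most connected:", most_connected)
```

Counting connections per node is one of the simplest network measures. A node with many connections may indicate a hub of trade or communication.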

Spatial Simulations

Spatial simulations typically involve a historical phenomenon taking place over an area of land. That area can be represented on a map, satellite photo, floor plan or even a curved surface such as a globe.

There are several tools commonly used for spatial history. Some tools are primarily for display and analysis, such as Geographical Information Systems (GIS), sometimes called Historical Geographic Information Systems (HGIS). Others can actually conduct spatial simulations. The most common such tools are cellular automata. These comprise an area (typically broken up into a grid or “cells”) upon which agents (representing objects such as people, animals, ships) exist. Some agents can move about, reproduce, consume resources and engage in other activities.

Riders on horses roam a grassy plain, leaving behind a trail to mark their claimed territory.

NetLogo is a popular tool for cellular automata (see above image), although it does require writing code. Cellular is an easier platform that offers a drag-and-drop interface.

Scalable Vector Graphics (SVG)

A good tool for displaying networks is Scalable Vector Graphics (SVG), which is a language that can produce graphics. It is intended to be embedded in html (web) pages. It works best to produce graphics with lines and shapes rather than photographic images. It works well to produce network diagrams.

How to Use SVG

SVG code is simple to write, but it tends to require a lot of planning or trial and error to get everything to appear as desired. First, an area (canvas) is declared with a width and height. Then various objects, such as circles and a line, are declared. You have to specify the position of each object on your canvas, as well as its size and color. Some objects such as ovals have additional characteristics to specify. Then the code is inserted into the body section of an html page.

The following is an example of SVG code.

<svg width="300" height="200">
    <line x1="50" y1="50" x2="200" y2="50" stroke="blue" stroke-width="2" />
    <circle cx="50" cy="50" r="40" stroke="black" stroke-width="4" fill="cyan" />
    <circle cx="200" cy="50" r="40" stroke="black" stroke-width="4" fill="yellow" />
</svg>

The output would appear thus:

Two circles connected by a horizontal line.

A simple SVG example.

SVG Used in A Simulation

Below is an example of a web simulation utilizing SVG. You can see how code similar to that above has been reused many times to form a more complex network diagram. It is possible to combine SVG with scripting languages to produce a dynamic simulation, or to trigger snapshots at any particular time (using the buttons). The code for the below simulation is more extensive than for the above example and too long to show here. However, the code is highly redundant, and it is not difficult to write.

History Grid (created in SVG)
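
Because SVG for a network is so repetitive, it can also be produced by a short script instead of being typed by hand. The following Python sketch is one possible approach and is not the code behind the History Grid. The node names, their positions and the network_svg function are assumptions for illustration.

```python
# Hypothetical nodes with (x, y) canvas positions, and the links between them.
nodes = {"A": (50, 50), "B": (200, 50), "C": (125, 150)}
links = [("A", "B"), ("B", "C"), ("A", "C")]


def network_svg(nodes, links, width=300, height=200, radius=20):
    """Build SVG markup: one <line> per link and one <circle> per node."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    # Lines are written first so that the circles are drawn on top of them.
    for start, end in links:
        x1, y1 = nodes[start]
        x2, y2 = nodes[end]
        parts.append(
            f'  <line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" '
            'stroke="blue" stroke-width="2" />'
        )
    for cx, cy in nodes.values():
        parts.append(
            f'  <circle cx="{cx}" cy="{cy}" r="{radius}" '
            'stroke="black" stroke-width="4" fill="cyan" />'
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg_text = network_svg(nodes, links)
print(svg_text)
```

The printed text can be pasted into the body of an html page or saved as a file ending in .svg. Adding a node or link then only requires editing the two data structures at the top.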

Scripting in JavaScript

JavaScript is a scripting language that is executed in web pages. It can be included directly in html files, or it can be in its own file and referenced by the html file. JavaScript is responsible for much of the interactivity you encounter in webpages. JavaScript can manipulate many of the objects in a web page.

Generally JavaScript must be enclosed in script tags.

Let’s explore a simple script included in an html page. Below is some HTML code for a button, followed by some JavaScript code. Clicking the button will pop up an alert window displaying “Hello!”

<button onclick="myPopupBox()"> Click here</button> 
<script> 
  function myPopupBox() { 
    alert("Hello!");
  }
</script>

D3 JavaScript library

D3 can be used for visual simulations with some interactivity. D3 is a Javascript Library to produce animated data visualizations. It drives many of the data visualizations you see online, such as in the New York Times. D3 is relatively easy for some tasks, but not others. It is much easier to use D3 if you already know some Javascript.

Resource

To see some live D3 demonstrations and for more information, see the d3 site.

Historical Applications

Now that you have seen some of the technologies, let’s explore some of the historical applications of spatial history. One such application is ORBIS: the Stanford Geospatial Network Model of the Roman World. ORBIS shows Roman communication costs in terms of both time and expense by simulating movement along the principal routes of the Roman road network, navigable rivers, and sea routes in the Mediterranean. You can try out ORBIS by opening the above link.

Map of Roman Empire with dots for major population centers and lines for transit.

ORBIS (credit Stanford U.)
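
The general idea of finding the cheapest route through a network of places can be sketched with Dijkstra's algorithm, a standard shortest-path method. The example below is not ORBIS's data or implementation. The place names and travel times in days are invented, and cheapest_route is a hypothetical helper.

```python
import heapq

# HYPOTHETICAL network: travel times in days between invented places.
routes = {
    "Harbor": {"Crossroads": 2, "Island": 1},
    "Island": {"Harbor": 1, "Capital": 6},
    "Crossroads": {"Harbor": 2, "Capital": 3},
    "Capital": {"Crossroads": 3, "Island": 6},
}


def cheapest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_cost, path) for the lowest-cost route."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, place, path = heapq.heappop(queue)
        if place == goal:
            return cost, path
        if place in visited:
            continue
        visited.add(place)
        for neighbour, step_cost in graph[place].items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + step_cost, neighbour, path + [neighbour])
                )
    return None  # no route connects start and goal


days, path = cheapest_route(routes, "Harbor", "Capital")
print(days, "days via", " -> ".join(path))
```

Replacing the travel times with monetary costs would give the cheapest route in terms of expense instead of time, which may differ from the fastest one.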

Cellular

Cellular automata are an approach to simulate spatial social systems. They typically use a grid to represent space and agents to represent actors.

Try to create a system of interacting agents in Cellular, a Snap App application written by Monash University. For example, below is a simulation of cows grazing. The green areas represent grass. The intensity of the green represents the height of the grass. As a cow grazes, an area of grass becomes depleted.

Cows on areas of grass of varying intensity

Cows grazing in Cellular simulation (Credit: Bernd Meyer et al., Monash University)

Below, we can see a video of the simulation in action, with cows moving about the field. Here, the agents are cows. In historical simulations, the agents could be invading raiders, lumberjacks, or other parties that can either move about or expand their territory.
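
For readers who prefer code to a drag-and-drop interface, a grazing simulation of this kind can be sketched in a few lines of Python. This is an independent illustration, not the Cellular project's code. The grid size, grass heights, regrowth rule and random movement are all assumptions.

```python
import random


def step(grass, cows, rng, regrowth=1, max_height=5):
    """Advance the field by one tick: cows graze, other cells regrow, cows move."""
    size = len(grass)
    grazed = set(cows)
    for row, col in grazed:
        grass[row][col] = 0  # a cow eats the grass where it stands
    for r in range(size):
        for c in range(size):
            if (r, c) not in grazed:
                grass[r][c] = min(max_height, grass[r][c] + regrowth)
    moved = []
    for row, col in cows:
        d_row, d_col = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        # Wrap around the edges so cows never leave the field.
        moved.append(((row + d_row) % size, (col + d_col) % size))
    return moved


rng = random.Random(42)              # fixed seed so runs are repeatable
field = [[5] * 6 for _ in range(6)]  # 6x6 field, grass at full height
herd = [(0, 0), (3, 3)]              # starting positions of two cows

for _ in range(10):
    herd = step(field, herd, rng)

for row in field:
    print(" ".join(str(height) for height in row))
```

Low numbers in the printed grid mark recently grazed cells. Swapping cows for raiders or lumberjacks, and grass for villages or forest, changes the story without changing the structure of the simulation.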

Organizations, Conferences and Projects

European Social Simulation Association (ESSA)
MURI Migration Project
Stanford Spatial History Project

Resources

Open ABM
NetLogo
Social Simulation 2019 (course at U. Utrecht)

11.1 Spatial and Network Simulations Activities

First published on February 15, 2020. Last updated on June 14, 2024.

Activities

Students will use Cellular to try out and manipulate several cellular automata simulations. If Cellular is unavailable, more ambitious students can try NetLogo.
Students will code a visual network, using a tool such as SVG.

12 Getting Immersed In History with Virtual Reality

First published on August 24, 2019. Last updated on June 14, 2024.

Objectives

Students will learn about various types of virtual reality such as animated and real life imagery, the challenges in developing VR experiences, and the tools to do so.

Virtual Reality is Literally 3D Printing Inside Out

Virtual Reality (VR) allows a user to view graphic content in three dimensions, and move about the graphics. Typically virtual reality provides stereoscopic capability, so the user perceives depth. Oculus makes dedicated virtual reality goggles. Samsung makes a headset that can convert an Android phone into a VR headset.

VR can even be used on ordinary laptop, tablet and phone screens. Although the experience does not use stereoscopy to produce 3D effects, one can still explore in three dimensions, as in the example below.

Roman Theater in Petra, Jordan (credit: Sitoo, CC BY-NC-ND 2.0)

Virtual reality scenes are most often one or more photographs or videos that have been manipulated or stitched together to form a continuous experience. Sometimes the images are computer generated. VR nearly always contains visual elements, but can be partially or entirely composed of sound or other sensory phenomena.

Although many VR scenes comprise actual images of places, some involve fabricated images. An example is the online 3D model of the Hidden Town 3D Christian David House.

Other 3D Visualization Technologies

Holography

Holography uses photographs taken with laser light to create 3D images. It has existed since the 1960s. Strictly speaking, this is not a digital technology because it can be done with chemical film. This technique is used to produce many of the “3D” stickers found on consumer products, labels or security tags.

Volumetric Projectors

Volumetric projectors offer the opposite experience from standard VR. You view a three-dimensional image from the outside. These projectors work by projecting one or more optical images onto a spinning blade or other surface to provide the illusion of volume.

Augmented Reality

Augmented Reality (or AR) involves adding non-real elements over a real image displayed on a computer or device screen or from digital projectors. It could be silly facial features on photo-sharing software, or annotations or a vector direction field on an image of a historical site. Often modern forms of AR are interactive, so you can use them for analysis purposes.

Analysis

Virtual reality can provide a person with a spatial experience of being somewhere real or imaginary. It can allow a person to explore that place as if they were physically there. However, most VR demonstrations are limited. They only allow a user to stand in one place, or to go through pre-determined paths. They also may be limited in time. If these are generated scenes, rather than from actual photographs or videos, then they may not be entirely accurate and the author might have chosen to render details according to their own agenda, priorities and preferences. Even when based upon actual images, the path or what is shown can reflect the agenda or biases of the author.

12.1 Virtual and Augmented Reality Activities

First published on February 15, 2020. Last updated on February 15, 2020.

Activities

Students will gain exposure to several virtual experiences in their university’s maker space or studio. Students should try out: the Virtual Reality (VR) version of Google Earth, historical VR experiences, geographic VR tours and an Augmented Reality (AR) tool such as a real-time photo manipulation or annotation tool.
Students will storyboard a concept for a VR experience.

13 Gaming as a Form of Simulation

First published on August 24, 2019. Last updated on February 15, 2020.

Objectives

Students will learn how games can be used as a form of historical simulation for both exploration and educational purposes. Students will examine how the need to make games marketable induces certain biases and misrepresentations that impact their historical accuracy.

Animations

Animations are one form of simulation. Animations are a series of artificially-created graphic images that are played in rapid succession to produce the illusion of change. Animations can be recordings of manually drawn and manipulated images. Or they can be graphic objects manipulated by a computer program.

Animations can involve maps that show the change of borders, or the movement of people. They can show environmental or social change, such as the spread of urbanization.
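As a minimal sketch of how a computer program can generate the frames of an animation, the following Python example builds a series of text-based frames showing a made-up town spreading across a small grid. The grid size, the spreading rule and the function names are all assumptions chosen for illustration. A real animation would draw graphic images rather than text characters.

```python
def next_frame(grid):
    """Return a new grid in which each urban cell ('#') spreads to its four neighbors."""
    size = len(grid)
    new_grid = [row[:] for row in grid]  # copy, so the old frame is left unchanged
    for r in range(size):
        for c in range(size):
            if grid[r][c] == "#":
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < size and 0 <= nc < size:
                        new_grid[nr][nc] = "#"
    return new_grid


def make_frames(size=5, steps=3):
    """Build a list of frames, starting from a single urban cell in the center."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    grid[size // 2][size // 2] = "#"
    frames = [grid]
    for _ in range(steps):
        grid = next_frame(grid)
        frames.append(grid)
    return frames


def render(frame):
    """Convert one frame into a printable block of text."""
    return "\n".join("".join(row) for row in frame)


if __name__ == "__main__":
    # Print the frames one after another; shown in rapid succession,
    # they would give the illusion of a town growing over time.
    for frame in make_frames():
        print(render(frame), end="\n\n")
```

Each frame is computed from the previous one, which is the sense in which the program, rather than a human illustrator, manipulates the graphic objects.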

A simple form of animation is an animated GIF, which is a series of several digital images bound together in a GIF file. A more complicated form is Flash, a proprietary format that required a browser plugin to display on web pages. Adobe ended support for Flash Player at the end of 2020. CSS also supports some animation, as does Java.

Video

Videos can show an entire simulation or snippets. Video clips are sometimes used in games, and are often a good way for secondary storyboarding of a game. Some videos can now be filmed in 3D to provide an interactive exploration experience.

Videos are a series of still images that are shown in rapid succession to present the illusion of movement. Videos can include actual photography (movies) and animations. Videos are different from animations that are produced on the fly by software, just as cinema is different from live theatrical performances. However, if generated animations or theatrical performances are recorded, then the recording might be a video.

Videos are very useful to demonstrate changing graphical images or for portraying interviews. They are also better for mind control and emotional manipulation, because the movie maker controls the timing of what is seen.

Video to display on computers comes in two major categories of formats. One category is produced by proprietary software and requires that software to be viewed, such as the .cmproj file produced by Camtasia. The other category consists of standard formats that can be viewed by a broad range of browsers, software and devices, such as .mp4 or .webm. Often you may produce the initial video in a proprietary format, then export it into standard formats.
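Converting between two standard formats can also be scripted. The following Python sketch assumes that the free command-line tool ffmpeg (not covered in this course) is installed and was built with the VP9 and Opus encoders. The file names are hypothetical. Proprietary project files generally still need to be exported from the software that created them before a tool like this can read them.

```python
import shutil
import subprocess


def build_webm_command(source, target):
    """Build an ffmpeg command that re-encodes a video as WebM (VP9 video, Opus audio)."""
    return [
        "ffmpeg",
        "-i", source,          # input file, for example an .mp4
        "-c:v", "libvpx-vp9",  # video encoder commonly used for .webm files
        "-c:a", "libopus",     # audio encoder commonly used for .webm files
        target,                # output file; the extension selects the container
    ]


def convert_to_webm(source, target):
    """Run the conversion if ffmpeg is installed; return True if it was run."""
    if shutil.which("ffmpeg") is None:
        print("ffmpeg was not found on this computer; nothing was converted.")
        return False
    subprocess.run(build_webm_command(source, target), check=True)
    return True
```

For example, convert_to_webm("interview.mp4", "interview.webm") would attempt to create a .webm copy of a hypothetical interview recording.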

Video is typically captured by a camera digitally or via film and then processed. (Some equipment can play original video recordings.) You may yourself view videos in their raw form, but you will almost always want to edit and process them to show to others. Editing software such as Adobe Premiere, Apple iMovie, or Camtasia can edit video and output it in various forms. You may also wish to re-process the video for streaming over the web.

Game Psychology

Most games involve gamification attributes. For example, many games involve action with a focus on fights and battles. Players can earn points, property and special capabilities, hence providing a sense of accomplishment and “income” for effort performed. Many games allow for competition with other players and group recognition for success.

Below is a screenshot from a very simple historical computer game. The player rules a dynasty, and can invest, add to reserves, build up the military or engage in conquest. Though simple, it contains several important gaming attributes. First, the player is empowered to make important choices. Second, there is a “reward” for smart actions, in this case the rewards of investment or successful conquest and dynastic longevity. Finally, there is an underlying element of chance, so that the outcome of decisions is not entirely predictable.

Results from the prior round are shown at top, along with action choices for the next round.

A screenshot from a simple historical computer game
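The three attributes described above (player choice, reward and chance) can be sketched in a few lines of Python. This is not the code of the game in the screenshot. The rules, numbers and names below are invented purely to illustrate how a single round of such a game might be structured.

```python
import random


def play_round(state, action, rng):
    """Apply one player choice to the dynasty and return the updated state."""
    treasury = state["treasury"]
    army = state["army"]
    luck = rng.random()  # element of chance: a number from 0.0 up to 1.0

    if action == "invest":
        # Reward: investment grows the treasury, by an amount that depends on luck.
        treasury += round(treasury * 0.2 * luck)
    elif action == "military":
        # Spending on the military converts treasure into army strength.
        cost = min(10, treasury)
        treasury -= cost
        army += cost
    elif action == "conquest":
        # A stronger army makes success more likely, but never certain.
        chance_of_success = army / (army + 20)
        if luck < chance_of_success:
            treasury += 30
        else:
            army = army // 2
    elif action == "reserve":
        pass  # Adding to reserves simply keeps the treasury safe this round.
    else:
        raise ValueError(f"Unknown action: {action}")

    return {"treasury": treasury, "army": army, "round": state["round"] + 1}


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so that a demonstration run is repeatable
    state = {"treasury": 100, "army": 10, "round": 1}
    for action in ["invest", "military", "conquest", "reserve"]:
        state = play_round(state, action, rng)
        print(action, state)
```

Even in a sketch this small, the designer's assumptions are built into the rules: here conquest is the only route to a large reward, a choice that already shapes what a player might conclude about history.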

Critical Approach to Games

People generally play games for their entertainment value. However, generally, games are not reviewed by historical experts for authenticity. Even when they are, the games may nevertheless still contain distortions of historical facts and processes. As a historian, it is important to realize that history may not progress in the same manner as such games simulate.

Discussion Questions

What type of historical principles can be demonstrated via gaming?
What are students really learning?
Should games be peer-reviewed?

Resources

A Beginner’s Guide To Making Your First Video Game (Kotaku)
DynastyGame website (code, further information)

13.1 Gaming Activities

First published on February 15, 2020. Last updated on February 15, 2020.

Activities

Students will present and discuss examples of historical games.
Students will demonstrate such games if time permits.

14 Relevance of Machine Learning and Artificial Intelligence to History

First published on August 24, 2019. Last updated on June 14, 2024.

Learning Objectives

Students will be introduced to concepts and advances in machine learning and artificial intelligence.
The use of AI as an aid to historians and as an image recognition and classification tool will be discussed.

What Is Machine Learning and Artificial Intelligence?

Machine learning is essentially telling a computer program that particular items of data are related to each other. For example, you could tell the program the width of a petal and the corresponding name of a plant variety. The program will then be able to make predictions of variety if you provide it petal widths in the future.
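The petal example can be sketched in Python. The measurements below are illustrative values made up for this example, not real data, and the method shown (comparing a new width to the average width learned for each variety, sometimes called a nearest-mean classifier) is only one simple way such a program might work.

```python
from statistics import mean

# Illustrative training data (assumed values, not real measurements):
# each pair is (petal width in cm, variety name).
training_data = [
    (0.2, "setosa"), (0.3, "setosa"), (0.2, "setosa"),
    (1.3, "versicolor"), (1.4, "versicolor"), (1.2, "versicolor"),
    (2.0, "virginica"), (2.1, "virginica"), (1.9, "virginica"),
]


def train(examples):
    """Learn the average petal width for each variety."""
    widths_by_variety = {}
    for width, variety in examples:
        widths_by_variety.setdefault(variety, []).append(width)
    return {variety: mean(widths) for variety, widths in widths_by_variety.items()}


def predict(model, width):
    """Predict the variety whose average petal width is closest to the given width."""
    return min(model, key=lambda variety: abs(model[variety] - width))


model = train(training_data)
print(predict(model, 0.25))
print(predict(model, 1.5))
```

The "telling" step is the training data, and the prediction step is the program applying what it was told to a width it has not seen before.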

Machine learning is a form of artificial intelligence (AI). Artificial intelligence comes in several forms, but more sophisticated machine learning is a common form. For example, if you show the program a series of photos, and then provide a tag for each photo (such as flower, dog, house), eventually the program will be able to recognize what future photos show.

Some experts say that AI will someday be able to completely replace thinking by humans. Other experts disagree, and say that while AI is good at some tasks, it cannot reproduce all human thinking processes. So far, the latter are correct, but as AI becomes further developed, only time will tell who is ultimately correct.

For historians, AI can be useful in going through large batches of text documents and photos and recognizing items of interest. Be warned that, like human minds, AI programs are not perfect. They make mistakes.

How AI and Machine Learning Work

Many AI and machine learning programs work by comparing something new to something known (or a set of known things). If the new item is sufficiently similar to a known item, then the program considers the new item to be of the same sort as the known item. For example, consider a flower classification system that figures out whether a new flower is a daisy or a rose. If the new flower has radially-extending petals, the program would probably consider it to be a daisy.

However, most programs don’t work on such a clear-cut, rule-driven basis. Rather, they work on a statistical basis, and must be trained. For example, a human will load photos of daisies and roses into the system, and indicate the flower type of each. The system will learn from this training. This way, cases that deviate from the ideal case can still be identified. For instance, a daisy viewed edge-on might still be identified successfully.
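A minimal sketch of this kind of similarity-based, trained classification is shown below in Python. It uses a technique called k-nearest neighbors, which is just one of many approaches. The two measurements describing each flower (petal narrowness and flower flatness, each between 0 and 1) and all of the values are hypothetical. A real system would extract far more information from each photo.

```python
import math
from collections import Counter

# Hypothetical training examples. Each flower is described by two made-up
# measurements between 0 and 1: (petal narrowness, flower flatness).
training_examples = [
    ((0.90, 0.80), "daisy"),
    ((0.85, 0.90), "daisy"),
    ((0.80, 0.30), "daisy"),  # a daisy photographed edge-on looks less flat
    ((0.20, 0.30), "rose"),
    ((0.30, 0.20), "rose"),
    ((0.25, 0.40), "rose"),
]


def classify(features, examples, k=3, max_distance=0.5):
    """Label a new flower by majority vote among its k most similar known flowers."""
    nearest = sorted(examples, key=lambda ex: math.dist(features, ex[0]))[:k]
    # If even the closest known flower is too different, decline to guess.
    if math.dist(features, nearest[0][0]) > max_distance:
        return "unknown"
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((0.88, 0.85), training_examples))  # a typical daisy
print(classify((0.82, 0.35), training_examples))  # a daisy seen edge-on
```

Because the decision is a vote among similar training examples rather than a fixed rule, the edge-on daisy can still be labeled a daisy so long as the training data contains something like it.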

Natural Intelligence As An Analogy

Humans and other animals have natural intelligence that is capable of learning. (Plants can learn, but the mechanism is much different than for animals and computers.) For example, animals learn what to eat or not eat based upon taste and their corresponding reactions to food. Animals and people learn to recognize images. Some types of images seem to be hardwired into the brain (e.g. how cats react to snake-like objects such as strings and cucumbers), while others are learned, such as letters and words for people. Think about how you learn to recognize and react to images. Think about how movies train people, even within just a few hours.

ChatGPT

The recently famous ChatGPT system is an artificial intelligence-based chat bot, designed to have natural conversations with humans. It is not really intended for research or analysis, although people often use it for those things, sometimes with disastrous consequences. Use it at your own peril!

Resources-Iris Data Sets

Iris flower data set (Wikipedia)
Kaggle Machine Learning With Iris Data Set
UCI Iris Data Set

Tools

TensorFlow open source machine learning platform
IBM Watson expert system platform for research
PyTorch AI library for Python
Lobe AI platform

Activities

Teams of students will debate whether AI or “history-bots” will replace historians.
Try out Weka (for students with time and patience). The download and set-up process may take a while, and some of the concepts may be odd at first, but this is one of the simpler machine learning/AI platforms.

15 The Future of Digital History & Future Topics

First published on . Last updated on June 9, 2024.

Who can say for certain what paths digital history will take in the future? More sources will likely be digitized. Whether such sources will be freely available or restricted behind paywalls or policies remains to be seen. Virtual reality will likely move us further and further into The Matrix. Gaming and other technologies allow us to literally create and change history. The only thing certain is that there will be new technologies and new uses.

Associate Professor David Staley of Ohio State University discussing digital history and the future.

Academic Programs of Interest

There are many career and research possibilities for historians with digital history knowledge. Working in public history for a city, state, museum or library is a great example.

In case you wish to do further academic work, there are several universities with undergraduate and graduate programs in digital history.

Clemson University (PhD)
George Mason University

Digital History

By Mark Ciotola

Table of Contents

Preface

1 What Is Digital Technology and How Historians Can Use It?

Learning Objectives

What Is Digital History?

Examples of Digital Tools

Resources

Recommended Reading

Further Reading

1.1 Introduction To Digital History Activities

Activities

Leveling Up

2 Becoming A Digital Explorer of the Online World of History

Objectives

Searching For Online Information

Types of Online Resources

Peer-Reviewed Articles

Books

Dissertations and Masters Theses

Hard-To-Find Materials

Primary Sources

Computer Security and Data Protection

Resources

Further Reading

2.1 Becoming A Digital Explorer Activities

Activities

Clio Online Web Tool

Restricted Online Resources Available to You

2.2 Level Up: Creating Web pages

Activity Objectives

Setting Up A Website

Styling Content With CSS

Hosting Your Page for Public Access

More Advanced Text Editors

3 Document Preservation and Retrieval—Saving Old Information With New Technology

Objectives

Traditional Means of Document Preservation

Saving Old Information With New Technology

Digital records and archives

Images of Primary Source Documents

Text-Based Documents of Primary Source

Secondary Sources

File Types

Text Files

Archival Files

Image Files

Audio-Video Files

Digital Preservation Technology

Resources, Platforms & Services

Further Reading

3.1 Document Preservation and Retrieval Activities

Activity Objective

Document Retrieval and Comprehension

4 Digging Deeper into Document Repositories

Activity Objectives

Additional Repositories

University Libraries

Public and Community Libraries

Historical Societies

Museums

Journals

Newspapers

Comprehensive

Resources

4.1 Digging Deeper into Document Repositories Activities

Activity Objectives

Text parsing and processing

4.2 Level Up: Perl and Workflows

Level Up

5 Databases

Database Concepts

Relationships

Reasons to Use Databases

Database Platforms

Custom-Developed Databases

Analysis: Databases Versus Parsing Programs