Overview

These instructions walk you through creating your own repository for publishing treebank data on GitHub using the Perseids Publications Treebank Template.

Minimum Prerequisites

  1. Have some treebank data files to publish. Currently this publication template requires treebank XML files that adhere to the Perseus Ancient Greek and Latin Treebank (AGDT) Schema and one of the tagsets supported by Arethusa. For more information on compliant treebank data with Perseids and Arethusa see the Perseids website or email perseids at tufts dot edu.
  2. Create an account on GitHub
  3. Although not absolutely necessary just to get started, we recommend getting familiar with GitHub and the Git version control system, as well as having a basic understanding of how to create and edit JSON files. Without these skills it may be difficult for you to fully manage your treebank publication site and troubleshoot any problems you might encounter with it.
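
Before uploading anything, it can help to confirm that your treebank files are at least well-formed XML. The following is a minimal sketch using only the Python standard library. It is not part of the template, it does not validate against the AGDT schema or any Arethusa tagset, and the function name find_malformed_xml is a hypothetical name chosen for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed_xml(directory):
    """Return (file name, error message) pairs for XML files that fail to parse.

    This checks well-formedness only. It does NOT check the files against
    the AGDT schema or a tagset.
    """
    problems = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError as err:
            problems.append((path.name, str(err)))
    return problems
```

Calling find_malformed_xml on the folder that holds your files returns an empty list when every file parses; otherwise each entry names a file and the position of the first parse error.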

Instructions

  1. Fork the base Repository
    1. Log in to your GitHub account at https://github.com
    2. Go to https://github.com/perseids-publications/treebank-template
    3. Click the Fork button.
      This puts a copy of the treebank-template repository into your own GitHub account.
  2. Set Up GitHub Actions
    1. Immediately after forking the repository, you should be prompted by GitHub to enable actions on your repository. Click Set Up Actions.
    2. (If you aren't prompted, click on the Actions tab yourself anyway.)
    3. Click I understand my workflows, go ahead and enable them.
      This step is necessary in order for GitHub to update your GitHub Pages site automatically whenever you update or add files to your repository. For more information on GitHub actions see GitHub Actions Help.
  3. Update the home page link for your site
    1. Click the package.json file in the root directory of the repository.
    2. Click the edit icon.
    3. Replace perseids-publications.github.io with, e.g., yourgithubaccount.github.io. That is, if your GitHub user name is "janedoe" you would change https://perseids-publications.github.io/treebank-template/ to https://janedoe.github.io/treebank-template/
    4. Scroll to the bottom of the page to Commit Changes and enter a message to describe your change to this file (e.g. "Updated homepage link").
    5. Make sure the option Commit directly to the master branch is selected
    6. Click Commit Changes
  4. Set Up GitHub Pages
    1. Click the Settings button
    2. Scroll down to the GitHub Pages section of the page.
    3. Select gh-pages branch for the source from the dropdown. NOTE: it is important that you select the gh-pages branch option from the dropdown yourself, even if it's preselected. Explicitly selecting it seems to be necessary for GitHub to register the setting. After this, you should see the notice Your site is ready to be published at https://youraccount.github.io/treebank-template
  5. Update Site Details
    1. Go back to the Code tab and navigate to the src directory. Click on the config.json file.
    2. Edit the file and update the following data fields:
      • title: set this to whatever you want the title of your site to be.
      • subtitle: set this to a subtitle you want to show for your site. It can be the empty string ("").
      • doi: set this to the empty string for now (""). (More information on adding a DOI is provided in the repository README.md file.)
      • copyright: set this to whatever copyright statement you want for your data files. We recommend using a Creative Commons license.
      • report: if you want people to be able to report any problems they find in your data, you can set this to the issues URL for your repository by replacing perseids-publications with your GitHub account name. Otherwise set it to the empty string ("").
      • github: replace perseids-publications with your GitHub account name.
      • twitter: replace this with the URL of your own Twitter handle, or else the empty string ("").
    3. Scroll to the bottom of the page to Commit Changes and enter a message to describe your change to this file (e.g. "Updated site config").
    4. Make sure the option Commit directly to the master branch is selected
  6. Go back to the Actions tab.
  7. You should see that your build workflow is in progress.
  8. Wait for the build to finish and confirm that it succeeds.
  9. At this point, your GitHub web page should be published and updated at https://(youraccountname).github.io/treebank-template
    So far it is published only with the default treebank files that come with the template. The next steps show you how to add a new file.
  10. Return to the Code tab of the repository in GitHub and navigate to the public/xml directory.
  11. Click the Upload button.
  12. Upload your file.
  13. Navigate to the src directory of your repository.
  14. Edit the config.json file again.
  15. You need to add a new entry for the file you just added to the publications section of the file. For each file added you need to define the following fields:
  16. It's a good idea to make sure the file parses as valid JSON prior to saving it. The JSON validator at https://jsonlint.com is a good resource to use for that.
  17. Save and commit your changes to the master branch.
  18. After the build succeeds, you should now see your new tree publication added to the site.
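
If you prefer to check the JSON on your own machine rather than pasting it into an online validator, Python's standard json module can do the same job. This is a minimal sketch, not something the template itself provides; check_json is a hypothetical helper name.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, otherwise a message locating the error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

Pass it the full contents of your edited config.json. A return value of None means the file parses; any other value tells you the line and column where parsing stopped, which is usually at or just after the mistake (a missing comma, a trailing comma, or an unclosed quote).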

You now have the basics of creating a GitHub-based publication of your treebank data using the Perseids Publications treebank template. From here you can proceed to remove the sample data from the publication by removing the files from the public/xml directory and removing the corresponding entries in the src/config.json file (doing so requires a local clone of your GitHub repository). You can also add additional files and publications, create a DOI for your data and integrate it with Alpheios. For additional information see also the repository's README.md file and the docs/Config.md file.
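
Once you are working from a local clone, the site-detail edits from the Update Site Details step can also be scripted instead of made in the GitHub editor. The sketch below is an illustration under stated assumptions: it assumes title, subtitle, doi, report and github are top-level string fields in src/config.json (check your own file and docs/Config.md first), and the function names are hypothetical.

```python
import json

# Account name that appears in the template's default values.
TEMPLATE_ACCOUNT = "perseids-publications"


def update_site_details(config, account, title, subtitle=""):
    """Return a copy of config with the site details filled in.

    Assumption: the fields are top-level string keys. Fields that are
    absent from the input are not created for report/github.
    """
    updated = dict(config)
    updated["title"] = title
    updated["subtitle"] = subtitle
    updated["doi"] = ""  # left empty for now, as the instructions suggest
    for key in ("report", "github"):
        if key in updated:
            # Swap the template's account name for your own.
            updated[key] = updated[key].replace(TEMPLATE_ACCOUNT, account)
    return updated


def rewrite_config(path, account, title, subtitle=""):
    """Load a config file from a local clone, update it, and write it back."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    updated = update_site_details(config, account, title, subtitle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(updated, handle, indent=2, ensure_ascii=False)
        handle.write("\n")
```

For example, rewrite_config("src/config.json", "janedoe", "My Treebanks") would update the file in place; you would then commit and push the change with Git as usual. Other keys, including the publications section, are carried over unchanged, although the file is re-serialized with two-space indentation.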