The core feature of the ‘Ark of the Government’ website is to store and catalog US Federal Government documents (and perhaps later, those of other governments).
This is already accomplished on a number of other public and private sites, obviously, but I need to learn Drupal and I can’t think of anything better. 🙂
In WordPress, the fundamental database entry is known as a 'post' and stored in the wp_posts table. In Drupal, the equivalent basic database entry is known as a 'node' and stored in the node table.
We’ll need a special Drupal content type to store our Documents in a one-to-one relationship: every document gets exactly one Drupal node.
Defining a ‘Government Document’
As I began this project I thought it would be as simple as ‘create a repository of federal documents.’ I didn’t realize the full scope of that idea, and how oversimplified it was. The distinction of what classifies laws, acts, statutes and more seems to be quite broad, and it may be difficult to compartmentalize them into neat little boxes.
Some simple questions help define the problem:
- “How many total documents are there?”
- “Who writes these documents?”
- “Where are these documents created?”
- “How can I get access to these documents?”
There is too much information to get started; I need to narrow the focus.
So to start, I’ll just be looking at enacted legislation by the US Congress.
A Brief Civics Refresher
As of this writing in the year 2017, it’s been 228 years since the first congressional session in 1789. Since every term is two years (as defined by the length of a House of Representatives term), in January of 2017 we started on our 115th Congressional Term.
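The term arithmetic above can be checked with a couple of lines of shell. This is only a minimal sketch: the function name congress_number is hypothetical, and it ignores the fact that a new Congress convenes in early January, so the first days of an odd year still belong to the previous term.

```shell
# Hypothetical helper: which Congress is sitting in a given year?
# Assumes two-year terms counted from 1789, using integer division.
congress_number() {
  echo $(( ($1 - 1789) / 2 + 1 ))
}

congress_number 2017   # prints 115
congress_number 1989   # prints 101
```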
A piece of legislation, a ‘bill,’ can be introduced in the Senate or the House of Representatives. The bill will be scrutinized, edited, and ultimately put to a vote in both bodies. If it passes both houses of Congress with a simple majority, it becomes ‘enrolled’ and goes to the President for approval. If the President approves and signs, it becomes a law. If the President vetoes, the legislation can still become a law if the House and Senate each override the veto with a two-thirds majority vote.
Over the course of the last 114+ congressional terms, some 20,000 pieces of legislation have become law in this manner. As mentioned above, to simplify this exercise for now, we’ll just be using these enacted laws.
Of course, Schoolhouse Rock! famously simplified this process with its three-minute ‘How a Bill Becomes a Law’.
An Example Document: The ADA
Congress.gov already hosts an excellent legislative document repository with simple field filtering. It lists 11,955 pieces of legislation that have become law.
The Americans with Disabilities Act (ADA) started in the Senate in 1989, introduced and sponsored by Senator Tom Harkin of Iowa as S. 933 in the 101st Congress.
When the bill became law it was formally enacted as Public Law 101-336 in 1990. It’s the Office of the Federal Register that assigns the Public Law number.
Here is the PDF as listed on the Government Publishing Office website. Below is an image of the document’s first of 52 pages.
So here we can easily identify the different ‘pieces’ of the document. It has both ‘short’ and ‘long’ titles, a table of contents, a Public Law number, a few dates and other pieces of codification.
Most of these ‘pieces’ will be commonly shared across all Public Law documents. It’s this common information we’ll set up as Drupal ‘nodes’ and ‘fields’ in the next sections.
‘Document’ Content Type Schema
Drupal ‘Node’ Basics
A Drupal ‘node’ is a basic piece of content stored in the node table of the database. It has a unique table ID, or node.nid, along with standard content information like the title and the created and edited dates. There are additional pieces of information stored in the node table, like the language, the published status and the comment setting of the entry.
Each node also has a ‘content type’ to group similar types of content. The default Drupal content types are article (for dynamic blog type content) and page (for more permanent static content).
Every node’s default URL is /node/ followed by its ID, for example /node/1. This can be overridden with a node alias, for example using the title of the node as the URL: /the-node-title. Like most modern CMSs, Drupal URL aliases can get very complex; more on that below.
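To make the title-as-URL idea concrete, here is a minimal shell sketch that turns a title into an alias-style slug. The function name title_to_alias is hypothetical, and this is not how Drupal or any of its modules generates aliases; it only illustrates the general transformation (lowercase, non-alphanumeric runs collapsed to hyphens).

```shell
# Hypothetical helper: build a URL-friendly slug from a title.
# Not Drupal's own algorithm; an illustration only.
title_to_alias() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Prefix the slug with a slash to get a path such as /the-node-title
printf '/%s\n' "$(title_to_alias 'The Node Title')"
```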
Drupal also has ‘Fields’: arbitrary buckets of information that can be attached to nodes. For example, the ‘body’ text of the node and the ‘attached image’ are both stored as fields.
Lastly, Drupal groups nodes together with ‘Taxonomies’ to classify content.
Our ‘Document’ content type will start pretty generically at first, similar to an ‘Article’ content type. It will have these fields:
- Title
  - This is a required field that all Documents must have
  - For now, we will also use a version of this field as the URL alias
  - For human readability, this will be the ‘short title’ of the document.
  - Using the ADA as an example, the ‘Title’ of our Document will be ‘Americans with Disabilities Act of 1990’
  - That creates the question though: “Do all public laws have short titles?”
- Body
  - This will potentially store all the text of the Public Law, though for now it will remain blank
- Document Publication Date
  - This is the enactment date
  - For the ADA, this date is ‘7/26/1990’ or 648950400 as an epoch timestamp (midnight UTC).
  - The template should not show the Drupal
- Public Law Number
  - A simple text field
  - For the ADA, this is ‘101-336’
- A ‘Featured Image’
  - A snapshot of the first page of the document
  - For the ADA, this would be the image used in the above example
- No comments
- No revisions
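The epoch value quoted for the ADA’s enactment date can be reproduced from the command line. This is a small sketch that assumes GNU date (the -d option is not portable to every system) and that the timestamp is taken at midnight UTC; the function name date_to_epoch is hypothetical.

```shell
# Hypothetical helper: convert a YYYY-MM-DD date to a Unix timestamp
# at midnight UTC. Assumes GNU date; other date implementations differ.
date_to_epoch() {
  date -u -d "$1 00:00:00" +%s
}

date_to_epoch 1990-07-26   # prints 648950400
```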
We’ll keep this content type simple for now. Ultimately though, a ‘Document’ content type will be extended with these possible taxonomies, or connections to other possible content types:
- Authoring Body
- Official Links to Government Websites
- Abstract ‘Declaration’ or ‘Rule’ the document attempts to implement
Creating Content Types from the Web Admin Interface
In Drupal, Content Types are stored as a configuration in the database, as opposed to a configuration in code. I’m accustomed to them being stored as code in WordPress, so this process may be a bit different for me at first.
In my opinion, site building configurations like this should be stored as code. However, I’m trying to learn as much as possible about Drupal, so I’ll try every way I can.
I ultimately want to know how Content Types are stored in the database, so first we’ll set up a bit of a backup configuration to reset when necessary. This way we can experiment a bit.
Backing up the Database with Drush
I may want to regularly back up the site during deployments. If the CI_RELEASE variable is set (checking with this method), that means it’s in the middle of a deployment, and I’ll want to use that date as the file name. Otherwise I can use the current date.
I’d like to update it to use the DRUPAL_ROOT instead of .., but this will do for now.
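A backup script along these lines might look like the sketch below. Everything specific in it is an assumption rather than the actual script: the function names, the ../backups directory, and the timestamp format used when CI_RELEASE is not set. The drush sql-dump command and its --result-file and --gzip options are standard Drush.

```shell
#!/bin/sh
# Hypothetical backup sketch; names, paths and date format are assumptions.

# Use the release identifier during a deployment, otherwise the current time.
backup_label() {
  if [ -n "${CI_RELEASE:-}" ]; then
    printf '%s\n' "$CI_RELEASE"
  else
    date +%Y%m%d-%H%M%S
  fi
}

# Build the dump path, one level above the Drupal root.
backup_path() {
  printf '../backups/%s.sql\n' "$(backup_label)"
}

# Dump and compress the database. Assumes it is run from the Drupal root.
backup_database() {
  mkdir -p ../backups
  drush sql-dump --result-file="$(backup_path)" --gzip
}

# Call backup_database from a deployment step to take the backup.
```

Keeping the file naming in its own function makes it easy to check without touching the database.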
Quick restore the Database with Drush?
I might want a similar drush-restore.sh script in the future, but for now I’m going to skip it and restore manually.
Creating the Content Type
Adding the Content Type is super fast. Go to /admin/structure/types/add to find the web GUI.
There we’ll configure these settings:
- Name: Document
- Description: A government document
- Preview before submitting: Disabled
- Explanation or submission guidelines: A Document must have a title
- Display Settings: Uncheck “Display author and date information.”
- Comment Settings: Select “Closed” for “Default comment setting for new content”
Click “Save Content Type.”
Add Drupal Fields to ‘Document’ Content Types
According to the basic specification we created above, we need to add ‘Public Law Number’, ‘Publication Date’ and ‘Document Image’ fields.
Here’s how to add the fields. Go to /admin/structure/types/manage/document/fields or click the “Manage Fields” tab in the edit content type admin screen.
‘Public Law Number’ is the simplest field to add: it’s a plain text field that doesn’t need to store much information, so let’s limit it to 10 characters. The example Public Law Number was 101-336.
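As a sanity check on that limit, the sketch below tests whether a value looks like a Public Law number and fits in 10 characters. The pattern (digits, a hyphen, digits) is an assumption based on the single example 101-336, and the function name is hypothetical; Drupal’s text field itself only enforces the maximum length.

```shell
# Hypothetical check: at most 10 characters, shaped like "101-336".
# The digits-hyphen-digits pattern is an assumption from one example.
is_public_law_number() {
  [ "${#1}" -le 10 ] && printf '%s\n' "$1" | grep -Eq '^[0-9]+-[0-9]+$'
}

if is_public_law_number "101-336"; then
  echo "101-336 looks valid"
fi
```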
Installing the ‘Date’ Module
By default, you cannot add ‘dates’ as a field type. We more than likely want to store the ‘Publication Date’ as a Unix timestamp, though it might have some long-term boundary issues (a timestamp can only go so far back in time).
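To see where that boundary might bite, the sketch below checks whether a date fits in a signed 32-bit integer, which bottoms out in December 1901. Whether the field is actually stored in a 32-bit column is an assumption here, and the helper names are hypothetical; the example also assumes GNU date. The First Congress convened on March 4, 1789, well before that limit.

```shell
# Range of a signed 32-bit integer, in seconds around the Unix epoch.
INT32_MIN=-2147483648
INT32_MAX=2147483647

# Hypothetical helper: YYYY-MM-DD to a timestamp at midnight UTC (GNU date).
date_to_epoch() {
  date -u -d "$1 00:00:00" +%s
}

# Succeeds only if the date's timestamp fits in 32 bits.
fits_int32() {
  ts=$(date_to_epoch "$1") || return 2
  [ "$ts" -ge "$INT32_MIN" ] && [ "$ts" -le "$INT32_MAX" ]
}

fits_int32 1990-07-26 && echo "1990-07-26 fits in 32 bits"
fits_int32 1789-03-04 || echo "1789-03-04 does not fit in 32 bits"
```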
A timestamp will work for now though. To save fields with timestamps, we have to add and enable the ‘Date’ module.
I added drush pm-enable date -y to the releases.sh file to install and enable the Date module. Doing this will of course also install and activate it later on the staging server, as created during the sub-theme post.
Now I can create the ‘Publication Date’ field.
Lastly, add the ‘Document Image’ field as an ‘image’ type. I left all the meta fields as default for now.
Adding the first Document
Now that we have our ‘Document’ Content Type configured with fields we can add the ADA.
Go to /node/add/document/ and enter all the information as it appears below.
After saving, it will then appear like this on the front end. I later updated the URL alias field to
A Quick Look at the Database
Now that we have our ‘Document’ Content Type built and our first Node added, let’s take a quick look at the database.
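One way to peek at the stored data is to query the node table joined to a field table. The sketch below only builds the SQL; it assumes Drupal 7’s default SQL field storage (a field_data_ table per field with a _value column) and a field machine name of field_public_law_number, neither of which is confirmed here. The function name is hypothetical.

```shell
# Hypothetical helper: print a query joining Document nodes to one field.
# Table and column names assume Drupal 7's default field storage.
document_query() {
  field="$1"   # field machine name, e.g. field_public_law_number (assumed)
  cat <<EOF
SELECT n.nid, n.title, f.${field}_value
FROM node n
LEFT JOIN field_data_${field} f
  ON f.entity_type = 'node' AND f.entity_id = n.nid
WHERE n.type = 'document';
EOF
}

# Usage on a working Drush site (not run here):
#   drush sql-query "$(document_query field_public_law_number)"
```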
This post gave a quick overview of defining and creating a new Drupal content type, along with showing how to create new nodes.
However, the content type configuration is still local to the development database and our configuration could use a bit of polish. The next steps will expand on this foundation and create a more usable admin.
Creating Content Types from Drush
There might be workarounds for this, like maybe using raw PHP scripts in the releases.php deployment file.
Creating Content Types from Custom Modules
As mentioned, content types should really be stored in code. This way they are tied to the repository and can be updated programmatically.
Adding Advanced Fields and Relationships
We still need to add automatic URL aliasing based on the title, plus more fields, including external links and taxonomies.
Importing Bulk Documents with Migration Script
Since the majority of this data is available publicly online, we should be able to rapidly import thousands of data points to quickly create a robust application.
- Drupal Docs: Understanding Drupal Content Types
- Drupal Docs: Working with content types and fields
- Drupal Docs: Backup Database with Drush
- Drupal Docs: About Nodes
- Drupal Project: Date
- USA: Laws and Regulations
- Wikipedia: ADA
- Wikipedia: United States Code
- Wikipedia: List of United States federal legislation
- Quora: What is the difference between law, act and statute?
- Senate.gov: Laws, Acts and Statutes
- House.gov: Legislative Process