Chisel

Before You Begin

This project centers on Markdown and HTML. Without some experience in both, it’ll be hard to understand how to design and implement your code.

If you’d like to build just enough familiarity, try these resources:

Introduction

HTML is an amazing tool for marking up documents, but it’s not very fun for writing content.

In 2004, John Gruber introduced Markdown. It’s a style of text formatting that’s less obtrusive than writing HTML, is easy to remember, and is highly readable even when not converted to HTML. Here’s an example:

# My Life in Desserts

## Chapter 1: The Beginning

"You just *have* to try the cheesecake," he said. "Ever since it appeared in
**Food & Wine** this place has been packed every night."

Using a markdown parser, we could convert that example markdown document into the following chunk of HTML:

<h1>My Life in Desserts</h1>

<h2>Chapter 1: The Beginning</h2>

<p>"You just <em>have</em> to try the cheesecake," he said. "Ever since it appeared in <strong>Food &amp; Wine</strong> this place has been packed every night."</p>

Experimenting with Markdown

There are markdown parsers available for just about every language you can imagine. In the Ruby world some of the best known are Redcarpet, RDiscount, and kramdown.

Let’s experiment with Redcarpet. Start by installing the gem:

$ gem install redcarpet

Then let’s start IRB and load the gem:

$ irb
> require 'redcarpet'

Now we can use Redcarpet from the IRB session to render the snippet of markdown we looked at before:

renderer = Redcarpet::Render::HTML.new
engine = Redcarpet::Markdown.new(renderer)
markdown_source = "# My Life in Desserts\n\n## Chapter 1: The Beginning\n\n\"You just *have* to try the cheesecake,\" he said. \"Ever since it appeared in **Food & Wine** this place has been packed every night.\""
engine.render(markdown_source)
=> "<h1>My Life in Desserts</h1>\n\n<h2>Chapter 1: The Beginning</h2>\n\n<p>&quot;You just <em>have</em> to try the cheesecake,&quot; he said. &quot;Ever since it appeared in <strong>Food &amp; Wine</strong> this place has been packed every night.&quot;</p>\n"

During this project, we’ll be building a simple markdown parser that performs some of the functions of Redcarpet!

As you work, it will sometimes be useful to check your results against Redcarpet, confirming that your code renders a chunk of markdown the same way Redcarpet does.

Learning Goals / Areas of Focus

  • Practice breaking a program into logical components
  • Test components in isolation and in combination
  • Apply Enumerable techniques in a real context
  • Read text from and write text to files

Base Expectations

An Interaction Model

We’re going to use Chisel from the command line, reading in Markdown files and writing out HTML. It’ll go like this:

$ ruby ./lib/chisel.rb my_input.markdown my_output.html
Converted my_input.markdown (6 lines) to my_output.html (8 lines)

Where my_input.markdown is a file like this:

# My Life in Desserts

## Chapter 1: The Beginning

"You just *have* to try the cheesecake," he said. "Ever since it appeared in
**Food & Wine** this place has been packed every night."

And the resulting my_output.html would contain the following:

<h1>My Life in Desserts</h1>

<h2>Chapter 1: The Beginning</h2>

<p>
  "You just <em>have</em> to try the cheesecake," he said. "Ever since it appeared in
  <strong>Food &amp; Wine</strong> this place has been packed every night."
</p>

Got it?
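The flow above could be sketched roughly like this. It’s only a starting point: the method names convert and chisel are hypothetical, and convert is a stand-in that returns its input unchanged until you replace it with your real parser.

```ruby
# Hypothetical stand-in for the real Markdown-to-HTML conversion.
def convert(markdown)
  markdown
end

# Reads the Markdown file, writes the HTML file, and returns the summary line.
def chisel(input_path, output_path)
  markdown = File.read(input_path)
  html = convert(markdown)
  File.write(output_path, html)
  "Converted #{input_path} (#{markdown.lines.count} lines) " \
    "to #{output_path} (#{html.lines.count} lines)"
end

# Run only when invoked directly with both file names, e.g.
#   ruby ./lib/chisel.rb my_input.markdown my_output.html
if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
  puts chisel(ARGV[0], ARGV[1])
end
```

Keeping the file handling in one small method means the conversion itself can be tested with plain strings, without touching the file system.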

Restrictions

  • Don’t use any regular expressions
  • Only use existing parsers to generate sample output or to validate your output

Expected Functionality

Parsing Markdown is a good application of spiraling design, so the expected functionality is broken down into levels. All of the levels must be completed to earn full marks.

Level 1 - Text Basics

A chunk of text is defined as one or more consecutive lines of content with no blank lines between them. For example, this is one chunk of text:

Paragraphs

By default, a free-standing line of text in a markdown document will go into a <p> tag. For example, this text:

This is the first line of the paragraph.

Would be rendered as:

<p>This is the first line of the paragraph.</p>

Additionally, lines separated by a single line break remain part of the same paragraph. For example, this markdown:

This is the first line of the paragraph.
This is the second line of the same paragraph.

Becomes:

<p>This is the first line of the paragraph. This is the second line of the same paragraph.</p>

If we want to create multiple paragraphs, we need to insert 2 line breaks to separate the lines:

This is the first line of the first paragraph.

This is the first line of the second paragraph.

Becomes:

<p>This is the first line of the first paragraph.</p>
<p>This is the first line of the second paragraph.</p>
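One way to approach the paragraph rules is sketched below. The method names (chunks, paragraph, render_paragraphs) are made up for illustration, and the sketch assumes blank lines are truly empty (no stray spaces). In keeping with the Restrictions, it splits on plain strings rather than regular expressions.

```ruby
# Splits a document into chunks separated by blank lines.
def chunks(markdown)
  markdown.split("\n\n").map(&:strip).reject(&:empty?)
end

# Joins the lines of one chunk with spaces and wraps them in a <p> tag.
def paragraph(chunk)
  text = chunk.lines.map(&:strip).join(" ")
  "<p>#{text}</p>"
end

# Renders every chunk as its own paragraph, one per output line.
def render_paragraphs(markdown)
  chunks(markdown).map { |chunk| paragraph(chunk) }.join("\n")
end
```

For example, render_paragraphs("First.\n\nSecond.") gives two <p> elements, while render_paragraphs("First.\nSecond.") gives one.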

Headers

The other basic text entity we’ll support is the header. Headers are used in documents to indicate a headline in large text. HTML supports six levels of header tags, <h1> through <h6>.

In markdown, we create a header with some number of # signs (corresponding to the header level) followed by the text for the header.

For example:

## Here's an H2

Becomes:

<h2>Here's an H2</h2>

Note that unlike paragraphs, markdown headers only contain one line. So this:

# Header
followed by text

Becomes:

<h1>Header</h1>
<p>followed by text</p>

And:

## Header 1
## Header 2

Becomes:

<h2>Header 1</h2>
<h2>Header 2</h2>

Requirements

Build up your Chisel so it supports:

  • A chunk of text starting with #, ##, ###, ####, or ##### is turned into an HTML header (<h1> through <h5>) with the header level corresponding to the number of # symbols
  • A chunk of text not starting with # is turned into a paragraph
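Here is a rough sketch of how the header rules might be handled within a single chunk. The names header_level, header, and render_chunk are hypothetical, and it assumes that a line with six or more # symbols falls back to being paragraph text. No regular expressions are used.

```ruby
# Counts the # symbols at the start of a line.
def header_level(line)
  line.chars.take_while { |char| char == "#" }.count
end

# Renders a single line as a header (levels 1-5), or returns nil.
def header(line)
  level = header_level(line)
  return nil unless (1..5).cover?(level)
  text = line[level..-1].strip
  "<h#{level}>#{text}</h#{level}>"
end

# Each header line stands alone; neighboring non-header lines
# are joined into one paragraph.
def render_chunk(chunk)
  groups = chunk.lines.map(&:strip).chunk_while do |before, after|
    header(before).nil? && header(after).nil?
  end
  groups.map do |lines|
    header(lines.first) || "<p>#{lines.join(' ')}</p>"
  end.join("\n")
end
```

Grouping with chunk_while is just one option; looping over the lines and tracking the current paragraph would work equally well.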

Level 2 - Formatting

With Level 1 completed, move on to Level 2:

  • Within either a header or a paragraph, any word or words wrapped in * should be enclosed in <em> tags
  • Within either a header or a paragraph, any word or words wrapped in ** should be enclosed in <strong> tags

Make sure to consider scenarios like this: My *emphasized and **stronged** text* is awesome.
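A minimal sketch of one approach to that nested scenario: handle ** first, then *, by splitting on the marker and wrapping every other piece. The names wrap_pairs and format_inline are hypothetical. The sketch assumes markers are properly paired; when a marker is unmatched (for example the leading * of a list item), it leaves the text untouched.

```ruby
# Wraps the text between each pair of markers in the given tag.
def wrap_pairs(text, marker, tag)
  pieces = text.split(marker, -1)
  # An even number of pieces means an unmatched marker: leave text as-is.
  return text if pieces.length.even?
  pieces.each_with_index.map do |piece, index|
    index.odd? ? "<#{tag}>#{piece}</#{tag}>" : piece
  end.join
end

# Converts ** before * so the double marker is not read as two singles.
def format_inline(text)
  wrap_pairs(wrap_pairs(text, "**", "strong"), "*", "em")
end
```

The order matters: converting * first would split every ** into two empty emphasis markers.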

Level 3 - Lists

Often in writing we want to create unordered (bullet) or ordered (numbered) lists. Build support for unordered lists like this:

My favorite cuisines are:

* Sushi
* Barbeque
* Mexican

Which should output:

<p>
  My favorite cuisines are:
</p>

<ul>
  <li>Sushi</li>
  <li>Barbeque</li>
  <li>Mexican</li>
</ul>

Then build support for ordered lists, which use numbers for the markers. Confusingly, the numbers themselves don’t matter; some authors mark every list element with 1. and let the HTML renderer (the browser) handle the numbering. For example:

My favorite cuisines are:

1. Sushi
2. Barbeque
3. Mexican

Which is turned into:

<p>
  My favorite cuisines are:
</p>

<ol>
  <li>Sushi</li>
  <li>Barbeque</li>
  <li>Mexican</li>
</ol>
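Both list types could be sketched along these lines. The method names are hypothetical, and the sketch assumes every line in a list chunk carries a marker ("* " for unordered, digits followed by ". " for ordered). Since the numbers don’t matter, they are simply discarded. String methods stand in for regular expressions.

```ruby
# True when the line looks like "1. Item": digits, then ". ", then text.
def ordered_item?(line)
  number, separator, _text = line.partition(". ")
  separator == ". " && !number.empty? &&
    number.chars.all? { |char| "0123456789".include?(char) }
end

# Wraps each item in <li> and the whole list in the given tag.
def wrap_items(tag, items)
  body = items.map { |item| "  <li>#{item}</li>" }.join("\n")
  "<#{tag}>\n#{body}\n</#{tag}>"
end

# Returns list HTML for a list chunk, or nil when the chunk is not a list.
def render_list(chunk)
  lines = chunk.lines.map(&:strip)
  if lines.all? { |line| line.start_with?("* ") }
    wrap_items("ul", lines.map { |line| line.delete_prefix("* ") })
  elsif lines.all? { |line| ordered_item?(line) }
    wrap_items("ol", lines.map { |line| line.partition(". ").last })
  end
end
```

Returning nil for non-list chunks lets the caller fall back to header or paragraph rendering.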

Extensions

If you finish all the base expectations, consider implementing two of these extensions:

Images

Add support for images, both with and without the optional title attribute. Don’t implement the reference-style syntax. See the specification.

Blocks & Blockquotes

Add support for both Blockquotes and Code Blocks.

Links

At this point you’re familiar with the basics of how Markdown works. Go straight to the source to see how HTML links should work. You do not need to implement the “Reference-Style Links”, just the normal inline ones.

Reference-Style Links

Revisit the documentation about Links and build up support for the reference-style links it describes.

Reverse Chisel

Can you implement a reverser which takes in HTML and outputs Markdown?

This extension is quite hard, so it counts double.

Evaluation Rubric

The project will be assessed with the following guidelines:

  • 4: Above expectations
  • 3: Meets expectations
  • 2: Below expectations
  • 1: Well below expectations

Expectations:

1. Ruby Syntax & Style

  • Applies appropriate attribute encapsulation
  • Developer creates instance and local variables appropriately
  • Naming follows convention (is idiomatic)
  • Ruby methods used are logical and readable
  • Developer implements appropriate enumerable methods (#each is used sparingly)
  • Code is indented properly
  • Code does not exceed 80 characters per line
  • Each class has correctly-named files and corresponding test files in the proper directories

2. Breaking Logic into Components

  • Code is effectively broken into methods & classes
  • Developer writes methods fewer than 8 lines long
  • No more than 3 methods break the Single Responsibility Principle (SRP)

3. Test-Driven Development

  • Each method is tested
  • Functionality is accurately covered
  • Tests implement Ruby syntax & style
  • Balances unit and integration tests
  • Evidence of edge-case testing
  • A test Rake task is implemented

4. Functionality

  • Application meets all requirements (extensions not required)

5. Version Control

  • Developer commits at a pace of at least 1 commit per hour
  • Developer implements branching and PRs
  • The final submitted version is merged into master