Wilson Mar

Yes, it’s a round-trip ticket


This post is about converting existing HTML into markdown text in a file like README.md.

I wrote this because I haven’t seen an approach like this described.

I’m having to convert hundreds of pages I’ve written in HTML since the 1990s.

Let me help you with this. Call me!

Why Markdown?

Back in 2004, Apple pundit John Gruber came up with the idea after becoming frustrated by the laborious HTML tags needed to format his content properly.

Markdown is a simple writing system which makes web-based documents both easier to write and easier to read in their raw state.

Many non-technical writers prefer writing Markdown text instead of using the mouse-enabled Microsoft Word. They say writing pure text allows them to keep their fingers firmly planted on the keyboard even as they apply formatting on the fly. Being able to format using text codes means they don’t have to stop typing or think about anything else to apply text styling.

This tutorial is for such people.

Automatic conversion

You can copy HTML and paste into Dom Christie’s website for conversion to Markdown:


Ordered lists

My favorite feature of Markdown is that it automatically numbers the items in ordered lists!

The numbers typed in front of the items do not have to be in sequence. After the first item, any number will do, even a 0.

1. First item.
0. Second item.
9. Third item.

Markdown renders the coding above correctly as 1, 2, 3.
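To make the renumbering concrete, here is a minimal Python sketch of what a renderer does with the typed numbers. The `renumber` helper is hypothetical (it is not part of any Markdown library), and it always starts at 1, whereas some parsers start from the number on the first item.

```python
import re

# An ordered-list marker at the start of a line: digits, a dot, then whitespace.
ORDERED_ITEM = re.compile(r"^(\d+)\.\s+(.*)$")

def renumber(lines):
    """Return the lines with each run of ordered-list items numbered 1, 2, 3, ..."""
    result = []
    counter = 0
    for line in lines:
        match = ORDERED_ITEM.match(line)
        if match:
            counter += 1
            result.append(f"{counter}. {match.group(2)}")
        else:
            # Simplification: any non-item line ends the run, so numbering restarts.
            counter = 0
            result.append(line)
    return result

numbered = renumber(["1. First item.", "0. Second item.", "9. Third item."])
print("\n".join(numbered))
```

A real parser is more forgiving than this sketch: it keeps counting across indented lines that sit between items.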


For numbering to continue from one item to the next, all lines between the items must be indented.

Heading lines can be indented.

Use 3 spaces in front of 3 backticks.

4 or more back-ticks is a signal to highlight the sentence in a box, not to indent.

Also, Liquid markdown does not recognize indentation.

PROTIP: A workaround if you are not able to get automatic numbering: code the numbering yourself.

To keep Markdown from interpreting a paragraph that starts with a number and a dot as a list item, put a backslash in front of the dot, as in:

1492\. That was the year.
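When converting many pages, this escape can be applied automatically. The following is a minimal sketch, assuming each paragraph is handled as its own string; `escape_leading_number` is a hypothetical helper name.

```python
import re

def escape_leading_number(paragraph):
    """Backslash-escape the dot when a paragraph starts with digits and a dot."""
    # \1 is the number, \\. is a literal backslash plus dot, \2 is the whitespace.
    return re.sub(r"^(\d+)\.(\s)", r"\1\\.\2", paragraph)

escaped_year = escape_leading_number("1492. That was the year.")
print(escaped_year)
```

Paragraphs that are already escaped, or that do not start with a number, pass through unchanged.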

Line breaks

Both styles of line break tags result in a new line (without a blank line in between):

the XHTML style:

Hello<br />there

or the HTML style:

Hello<br>there

One reason Markdown text is easier to write than HTML is there is no need for <p> to force a blank line.

Just a blank line will do.

You can mass-replace <p> with a blank line in a text editor.

Remember to clean up ending </p> tags.
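The same mass change can be scripted. This is a minimal sketch using only Python's standard `re` module; `paragraphs_to_blank_lines` is a hypothetical helper name, and the regular expressions assume reasonably tidy HTML.

```python
import re

def paragraphs_to_blank_lines(html):
    """Replace <p> tags with blank lines and drop the closing </p> tags."""
    # Match <p> or <p ...attributes...>, but not other tags such as <pre>.
    text = re.sub(r"<p(\s[^>]*)?>", "\n\n", html, flags=re.IGNORECASE)
    text = re.sub(r"</p>", "", text, flags=re.IGNORECASE)
    # Collapse runs of three or more newlines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()

converted = paragraphs_to_blank_lines('<p>Hello</p>\n<p class="x">there</p>')
print(converted)
```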

Bulk change HTML to Markdown programs

You can specify a URL to an HTML file:

It returns a page of Markdown text you can copy and paste to a Markdown file.

The author of that site provides his Python program at:

Download and run the program using this syntax (assuming Python is installed):

   chmod a+x html2text.py ; ./html2text.py erlang.html

PROTIP: Automatic approaches today are usually too automatic, converting what is better left in HTML.

Unordered Lists

CAUTION: Even though HTML can be written or pasted into Markdown (.md) files, that HTML must be more strictly correct than the HTML that internet browsers tolerate.

  • There must be a blank line before <ul> or <ol>.

  • For every <li> there needs to be a </li> or the rendering goes wacky.

  • There must be a blank line after anchor tags <a name=... and a heading text line.

PROTIP: Markdown recognizes different characters to parse into lists:

* Asterisk

+ plus sign

- minus sign

render as:

  • Asterisk

  • plus sign

  • minus sign

Special characters

Markdown treats these characters as ordinary text if there is a backslash escape character in front of them:

  • \\ backslash itself
  • \` backtick
  • \* asterisk
  • \_ underscore
  • \{ \} curly braces
  • \[ \] square brackets
  • \( \) parentheses
  • \# hash mark
  • \+ plus sign
  • \- minus sign (hyphen)
  • \. dot
  • \! exclamation mark
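Escaping can also be done in bulk. The sketch below puts a backslash in front of every character in the list above; `escape_markdown` is a hypothetical helper name. It escapes every occurrence, including dots and hyphens that would have been harmless, so treat it as a starting point to trim rather than a finished tool.

```python
import re

# Every character from the list above, written as a regex character class.
SPECIAL_CHARACTERS = re.compile(r"([\\`*_{}\[\]()#+\-.!])")

def escape_markdown(text):
    """Put a backslash in front of each character Markdown treats specially."""
    return SPECIAL_CHARACTERS.sub(r"\\\1", text)

escaped_text = escape_markdown("*not bold*")
print(escaped_text)
```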

PROTIP: If a URL contains query parameters, convert each & (ampersand) to &amp;.

Tools are also helpful for converting the special characters that Markdown turns into escape entities, which begin with an & (ampersand):

  • < (less than) is turned into &lt;

  • > (greater than) is turned into &gt; because that’s used to signify block quotes in Markdown.

  • the ampersand itself turns to &amp;, as in link URLs.
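Python's standard library already performs these three conversions. Here is a small sketch using `html.escape`, with `quote=False` so that only &, < and > are converted; the URL is the YouTube link used elsewhere in this post.

```python
from html import escape, unescape

url = "https://www.youtube.com/watch?v=Onv9nhPIBp0&t=1m2s"

# quote=False converts only &, < and >, leaving quote characters alone.
escaped_url = escape(url, quote=False)
escaped_text = escape("a < b > c", quote=False)

print(escaped_url)
print(escaped_text)

# unescape() reverses the conversion.
assert unescape(escaped_url) == url
```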


Replace opening heading tags such as <h2> with ## (called Atx-style headers).

Markdown recognizes up to 6 hash characters for 6 levels.

The closing ‘##’ is optional. It can be any number of hash characters.
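The heading conversion lends itself to a regular expression. This is a minimal sketch, assuming each heading is opened and closed by matching tags; `headings_to_atx` is a hypothetical helper name, and any attributes on the tag are discarded.

```python
import re

# <h1> through <h6>, optional attributes, then the matching closing tag.
HEADING = re.compile(r"<h([1-6])[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)

def headings_to_atx(html):
    """Rewrite <hN>Title</hN> as N hash characters followed by the title."""
    def replace(match):
        level = int(match.group(1))
        title = match.group(2).strip()
        return "#" * level + " " + title
    return HEADING.sub(replace, html)

atx = headings_to_atx("<h2>Why Markdown?</h2>")
print(atx)
```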


Alternately, Setext-style headers are specified (“underlined”) by a series of equal signs (for first-level headers) and dashes (for second-level headers):

First-level H1 headers
======================

Second-level H2 headers
-----------------------

Tables in HTML

HTML tables render well within a Markdown text document.

However, in the early days of the internet some HTML tables were used to lay out an entire page. Such coding needs surgery to look good, since tables are now intended to fit into a text column.

Bold and italics

CAUTION: Markdown coding is not processed within HTML tables.

So within tables continue to bold with

<strong>emphasized</strong> rather than Markdown __emphasized__ or **emphasized**

which renders as:

emphasized rather than Markdown emphasized or emphasized

Continue to italicize with:

<em>italicized</em> rather than Markdown _italicized_ or *italicized*

which renders as:

italicized rather than Markdown italicized or italicized
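A bulk conversion therefore has to skip tables. Here is a minimal sketch that turns bold and italic tags into Markdown everywhere except inside table blocks. The helper names are hypothetical, and the code assumes tables are not nested.

```python
import re

# The capture group makes split() keep the table blocks in the result.
TABLE = re.compile(r"(<table\b.*?</table>)", re.IGNORECASE | re.DOTALL)

def emphasis_to_markdown(html):
    """Convert bold and italic tags to ** and * markers."""
    flags = re.IGNORECASE | re.DOTALL
    text = re.sub(r"<(strong|b)>(.*?)</\1>", r"**\2**", html, flags=flags)
    return re.sub(r"<(em|i)>(.*?)</\1>", r"*\2*", text, flags=flags)

def convert_outside_tables(html):
    """Apply emphasis_to_markdown to everything except table blocks."""
    parts = TABLE.split(html)
    # With one capture group, the odd-numbered parts are the tables themselves.
    return "".join(
        part if index % 2 == 1 else emphasis_to_markdown(part)
        for index, part in enumerate(parts)
    )

sample = ("<strong>a</strong>"
          "<table><tr><td><strong>b</strong></td></tr></table>"
          "<em>c</em>")
converted_sample = convert_outside_tables(sample)
print(converted_sample)
```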


To see your markdown turn into HTML, use this online tool:

The easiest way to convert HTML to Markdown text is to use Aaron Swartz’s

My experience is that we’ll need to pretty much go through each line to make it look good in Markdown text.

PROTIP: Keep coding HTML to link to external sites and images.

Example of HTML:

<a target="_blank" title="hello" href="http://wilsonmar.github.io/">my site</a>

The biggest hassle with converting to Markdown text from HTML coding is that Markdown reverses the order of text and links.
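To illustrate the reversal, here is a minimal Python sketch that rewrites an anchor tag as a Markdown inline link, with the text first and the URL second. `anchor_to_markdown` is a hypothetical helper name, and the code assumes double-quoted attributes. Note that the target attribute is lost, because the Markdown link form has no place for it.

```python
import re

ANCHOR = re.compile(r"<a\s+([^>]*)>(.*?)</a>", re.IGNORECASE | re.DOTALL)
ATTRIBUTE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')

def anchor_to_markdown(html):
    """Rewrite <a href="URL" title="T">text</a> as [text](URL "T")."""
    def replace(match):
        attributes = dict(ATTRIBUTE.findall(match.group(1)))
        href = attributes.get("href")
        if href is None:
            # Named anchors such as <a name="..."> have no URL, so keep them.
            return match.group(0)
        text = match.group(2)
        title = attributes.get("title")
        if title:
            return f'[{text}]({href} "{title}")'
        return f"[{text}]({href})"
    return ANCHOR.sub(replace, html)

link = anchor_to_markdown(
    '<a target="_blank" title="hello" href="http://wilsonmar.github.io/">my site</a>'
)
print(link)
```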


The same goes for the alternate “automatic” format Markdown offers to link:


I’m reluctant to put external links in Markdown because they open in the same window, causing my site to lose visitors to that site.

![mysite logo](http://wilsonmar.github.io/favicon.png/ "optional title")

Notice that links to images have an exclamation point in front.

Markdown currently has no syntax for specifying the dimensions of an image.
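One way to cope is to write Markdown when no dimensions are needed and fall back to an HTML img tag when they are. This is a minimal sketch; `image_markup` and its parameters are hypothetical names chosen for illustration.

```python
from html import escape

def image_markup(src, alt, title=None, width=None, height=None):
    """Return Markdown image syntax, or an <img> tag when a size is given."""
    if width is None and height is None:
        if title:
            return f'![{alt}]({src} "{title}")'
        return f"![{alt}]({src})"
    # Markdown cannot carry dimensions, so emit HTML instead.
    attributes = [f'src="{escape(src)}"', f'alt="{escape(alt)}"']
    if title:
        attributes.append(f'title="{escape(title)}"')
    if width is not None:
        attributes.append(f'width="{width}"')
    if height is not None:
        attributes.append(f'height="{height}"')
    return "<img " + " ".join(attributes) + " />"

logo_url = "http://wilsonmar.github.io/favicon.png/"
plain_image = image_markup(logo_url, "mysite logo", "optional title")
sized_image = image_markup(logo_url, "mysite logo", width=32, height=32)
print(plain_image)
print(sized_image)
```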

To embed a YouTube video, use an HTML iframe.

<iframe width="560" height="315" src="https://www.youtube.com/embed/Onv9nhPIBp0" frameborder="0" allowfullscreen> </iframe>

To specify starting the video at a specific time (1 minute 2 seconds), use a link such as:

<a target="_blank" href="https://www.youtube.com/watch?v=Onv9nhPIBp0&t=1m2s">Link to YouTube</a>.

Horizontal rule

A line going across the page in HTML is:

<hr />

Blockquotes in HTML

Markdown ignores the HTML <blockquote> tag. So this appears as if it were not surrounded by the tag:

This is a block quote.

Different Parsers

The trouble with Markdown code is that different parsers render it differently into HTML.

In March, 2016 GitHub switched to the Kramdown parser which claims to incorporate the capabilities of other parsers:

Liquid Markdown Syntax

Markdown text in GitHub recognizes Liquid syntax as defined in:

This coding processes the text between a pair of Liquid {% %} tag markers as HTML:

{% highlight html %}
{% endhighlight %}

Liquid output markup can also be specified between two curly braces, such as:

{{ page.heading | upcase | truncate: 8 }}

The page.heading refers to the heading variable specified in the front matter at the top of the file.
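For readers unfamiliar with Liquid filters, here is a rough Python emulation of what that filter chain does. The functions are stand-ins written for this post, not Liquid's own code, and the heading value is made up. As I understand Liquid's truncate filter, the requested length includes the three dots of the ellipsis.

```python
def upcase(value):
    """Emulate Liquid's upcase filter."""
    return value.upper()

def truncate(value, length, ellipsis="..."):
    """Emulate Liquid's truncate filter: the length includes the ellipsis."""
    if len(value) <= length:
        return value
    keep = max(length - len(ellipsis), 0)
    return value[:keep] + ellipsis

# Hypothetical front matter value, for illustration only.
heading = "Markdown from HTML"

# Filters apply left to right, like {{ page.heading | upcase | truncate: 8 }}.
result = truncate(upcase(heading), 8)
print(result)
```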

To display Liquid markup in documentation:

{% highlight html %}{% raw %}
{{ page.heading | upcase | truncate: 8 }}
{% endraw %}{% endhighlight %}

In fact, Liquid is a simple yet complete programming language in its own right, with if/then/else, for loops, etc. The home page for the Liquid template language (written in Ruby):


What if we pasted JavaScript (wrapped between <script> tags) in Markdown?


This incorporates the thorough detail about markdown coding at:

A discussion forum about markdown is at:

