Regular Expressions

Flashcard 10 of 10

Given the following HTML document (as a string), we want to extract the names of all the opening tags.

<div>
  <p class="content">
    Consider the SUT safe.
    <a href="https://www.google.com">
      Google
    </a>
    is your friend. <br />
    Another line here.
    <hr class="divider" />
    Final line
  </p>
</div>

Our first attempt comes close:

html_string = <<-HTML
<div>
  <p class="content">
    Consider the SUT safe.
    <a href="https://www.google.com">
      Google
    </a>
    is your friend. <br />
    Another line here.
    <hr class="divider" />
    Final line
  </p>
</div>
HTML

pattern = /<([a-z]+) *[^\/]*?>/

html_string.scan(pattern)
#=> [["div"], ["p"]]

but the <a> tag is missing from the list. What can we do to fix this?

Answer:
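
The first attempt skips the a tag because [^\/]*? refuses any slash between the tag name and the closing >, and the href value "https://www.google.com" contains slashes. One possible fix, offered as a sketch rather than the card's official answer, is to exclude > instead of / inside the tag and use a negative lookbehind to keep rejecting self-closing tags such as <br />. The helper name opening_tag_names is hypothetical, and the sketch assumes that no attribute value contains a > character.

```ruby
# Hypothetical helper (not part of the original flashcard): returns the names
# of opening tags, skipping closing tags and self-closing tags such as <br />.
def opening_tag_names(html)
  # [^>]*     allows any attribute text, including the slashes inside a URL
  # (?<!\/)>  requires that the closing ">" is not immediately preceded by "/"
  pattern = /<([a-z]+)[^>]*(?<!\/)>/
  html.scan(pattern).flatten
end

html_string = <<-HTML
<div>
  <p class="content">
    Consider the SUT safe.
    <a href="https://www.google.com">
      Google
    </a>
    is your friend. <br />
    Another line here.
    <hr class="divider" />
    Final line
  </p>
</div>
HTML

opening_tag_names(html_string)
#=> ["div", "p", "a"]
```

Calling flatten turns the nested arrays that scan returns for a capture group into a flat list of names. Without it, the result would be [["div"], ["p"], ["a"]], matching the shape of the first attempt's output.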
