eDiscovery Best Practices: Search “Gotchas” Still Get You

December 06, 2011

By Doug Austin

 

A few days ago, I reviewed search syntax that one of my clients had prepared and noticed a couple of “gotchas” that typically cause problems. While we’ve discussed them on this blog before, it was over a year ago (when eDiscovery Daily was still in its infancy and had a fraction of the readers it has today), so they bear covering again.

Letting Your Wildcards Run Wild

This client liberally used wildcards to catch variations of words in their hits.  As noted previously, sometimes you can retrieve WAY more with your wildcards than you expect.  In this case, one of the wildcard terms was “win*” (presumably to catch win, wins, winner, winning, etc.).  Unfortunately, there are 253 words that begin with “win”, including wince, winch, wind, windbag, window, wine, wing, wink, winsome, winter, etc.

How do I know that there are 253 words that begin with “win”?  Am I an English professor?  No.  But, I did stay at a Holiday Inn Express last night.  Just kidding.

Actually, there is a site for that. Morewords.com shows a list of words that begin with your search string (e.g., to get all 253 words beginning with “win”, go here; simply substitute any characters for “win” in the URL to see the words that start with those characters). This site enables you to test your wildcard terms before using them in searches and to substitute the specific variations you want if the wildcard search is likely to retrieve too many false hits. Or, if you use an application like FirstPass™, powered by Venio FPR™, for first pass review, you can type the wildcard string in the search form, display all the words in your collection that begin with that string, and select the variations on which to search. Either way, you avoid retrieving a lot of false hits you don’t want.
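If you have a word list exported from your own collection, the same check can be sketched in a few lines of Python. This is only an illustration: the function names, the sample vocabulary, and the "OR" query syntax are assumptions made for the example, not the behavior of Morewords.com, FirstPass, or any particular search engine.

```python
def expand_wildcard(term, vocabulary):
    """Return the sorted, distinct words in vocabulary that match a trailing-* term."""
    if not term.endswith("*"):
        raise ValueError("this sketch only handles a trailing wildcard, e.g. win*")
    prefix = term[:-1].lower()
    return sorted({word.lower() for word in vocabulary if word.lower().startswith(prefix)})


def build_or_query(words):
    """Join the chosen variations with OR (hypothetical syntax; engines differ)."""
    return " OR ".join(words)


# Hypothetical sample of words found in a document collection.
vocabulary = ["win", "wins", "winner", "winning", "window", "winter", "wine", "Wind", "lose"]

# Step 1: see everything the wildcard would actually hit.
matches = expand_wildcard("win*", vocabulary)

# Step 2: keep only the variations you really want.
wanted_variations = {"win", "wins", "winner", "winning"}
selected = [word for word in matches if word in wanted_variations]

# Step 3: search on the explicit variations instead of the wildcard.
query = build_or_query(selected)
print(query)  # win OR winner OR winning OR wins
```

Reviewing the full list of matches first makes the false hits (window, winter, wine, wind) visible before they ever reach your review set.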

Those Stupid Word “Smart” Quotes

As many attorneys do, this client used Microsoft Word to prepare the proposed search syntax. The last few versions of Microsoft Word, by default, automatically change straight quotation marks ( ' or " ) to curly “smart” quotes as you type. When you copy that text to a format that doesn’t handle the smart quotes properly (such as HTML or a plain text editor), the quotes can show up as garbage characters because they are not ASCII characters. So:

“smart quotes” aren’t very smart

will look like this…

âsmart quotesâ arenât very smart

And, your search will either return an error or some very odd results.
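One way to catch this before running a search is to check the search terms for smart quotes and convert them to straight quotes. The sketch below is a minimal illustration in Python; the function names are made up for the example, and it covers only the four most common curly quote characters. The last lines reproduce the garbling by deliberately reading UTF-8 text with the wrong encoding, which is one common cause of the problem (an assumption about how it arises, not the only possibility).

```python
# Map the four common "smart" quotes to their straight ASCII equivalents.
SMART_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})


def has_smart_quotes(text):
    """Return True if text contains any of the mapped curly quotes."""
    return any(ord(ch) in SMART_TO_STRAIGHT for ch in text)


def straighten_quotes(text):
    """Replace curly quotes with straight quotes; leave everything else alone."""
    return text.translate(SMART_TO_STRAIGHT)


# The example phrase from above, written with escapes so the code is pure ASCII.
term = "\u201csmart quotes\u201d aren\u2019t very smart"

clean = straighten_quotes(term)
print(has_smart_quotes(term))   # True
print(has_smart_quotes(clean))  # False
print(clean)                    # "smart quotes" aren't very smart

# How the garbage appears: UTF-8 bytes decoded as if they were Latin-1.
garbled = term.encode("utf-8").decode("latin-1")
```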

To learn how to disable the automatic changing of quotes to smart quotes or replace smart quotes already in a file, refer to this post from last year. And, be careful, there are a lot of “gotchas” out there that can cause search problems. That’s why it’s always best to be a “STARR” and test your searches, then refine and repeat them until they yield expected results.

So, what do you think? Have you run into these “gotchas” in your searches? Please share any comments you might have, or let us know if you’d like to know more about a particular topic.
