Friday, November 18, 2011

Who Else is Behind SOPA? (Other Than Hollywood, We All Know That)

First, what is SOPA anyway? See the Wikipedia page linked below if you don't already know. Perhaps a more important question: what does it have to do with someone living outside the US? After all, Americans should bear the consequences of letting a bunch of Hollywood studios control their Congress. But that's not the entire issue here; we are talking about a country that is still more or less in control of a significant part of the Internet infrastructure, and its (stupid) actions can affect everyone else.

In Australia, we had this blacklisting and Internet censorship issue too. Fortunately, due to significant opposition (yeah, surprisingly even from the US government!) it has gone away for a while, and now the US Congress is arriving late to the bandwagon. Going back to the topic, we all know Hollywood is stuck with a business model of ripping people off and will try everything to force people to fill its pockets, but what interests me most is the support of the Business Software Alliance. Here is the list of its members:

  • Adobe
  • Apple
  • Autodesk
  • AVEVA
  • AVG
  • Bentley Systems
  • CA
  • Cadence Design Systems
  • CNC Software - Mastercam
  • Compuware
  • Corel
  • Dassault Systèmes SolidWorks Corporation
  • Dell
  • Intel
  • Intuit
  • Kaspersky
  • McAfee
  • Microsoft
  • Minitab
  • Progress Software
  • PTC
  • Quark
  • Quest
  • Rosetta Stone
  • Siemens PLM Software, Inc.
  • Sybase
  • Symantec
  • TechSmith
  • The MathWorks

It is very sad to see so many big software companies behind SOPA. If you live in the US, get involved and do something about it. For the rest of us, consider not purchasing software or hardware from these companies, to let them know it takes more than quality products (I'm referring to Intel here, I'm no Apple fan) to gain customers.

Links:
SOPA Wikipedia Page

Tuesday, November 15, 2011

Truncating HTML in Java

I'm currently working on a web application written in Java. Part of this application generates summaries of long HTML texts by truncating them to a fixed number of words. In Python, I usually use Django's truncate_html_words (the function behind the truncatewords_html template filter), so I looked for a similarly easy method in Java.

I did a couple of searches on Google but couldn't find anything quick and easy. So I went back and looked at Django's code, and fortunately it was very straightforward. Find the adapted code below:
/**
 * Copyright (c) Django Software Foundation and individual contributors.
 * All rights reserved.
 * 
 * Copyright (c) 2011 Masood Behabadi <masood@dentcat.com>
 *
 * Redistribution and use in source and binary forms, with or without modification,
 * are permitted provided that the following conditions are met:
 *
 *    1. Redistributions of source code must retain the above copyright notice, 
 *       this list of conditions and the following disclaimer.
 *    
 *    2. Redistributions in binary form must reproduce the above copyright 
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *
 *    3. Neither the name of Django nor the names of its contributors may be used
 *       to endorse or promote products derived from this software without
 *       specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
 * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
 * ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */ 

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Place this method inside a utility class of your choice.
public static String truncateHtmlWords(String html, int length) {
 if (length <= 0)
  return "";

 // HTML 4 elements that never take a closing tag
 List<String> html4Singlets = Arrays.asList(
  "br", "col", "link", "base", "img",
  "param", "area", "hr", "input");
 // Set up regular expressions
 Pattern pWords = Pattern.compile("&.*?;|<.*?>|(\\w[\\w-]*)");
 Pattern pTag = Pattern.compile("<(/)?([^ ]+?)(?: (/)| .*?)?>");
 Matcher mWords = pWords.matcher(html);
 // Count non-HTML words and keep note of open tags
 int endTextPos = 0;
 int words = 0;
 List<String> openTags = new ArrayList<String>();
 while (words <= length) {
  if (!mWords.find())
   // Checked through the whole string
   break;
  if (mWords.group(1) != null) {
   // It's an actual non-HTML word
   words += 1;
   if (words == length)
    endTextPos = mWords.end();
   continue;
  }
  // Check for tag; lookingAt() anchors at the start of the token,
  // like Python's re.match in the original
  Matcher tag = pTag.matcher(mWords.group());
  if (!tag.lookingAt() || endTextPos != 0)
   // Don't worry about non tags or tags after our
   // truncate point
   continue;
  String closingTag  = tag.group(1);
  // Element names are always case-insensitive
  String tagName     = tag.group(2).toLowerCase();
  String selfClosing = tag.group(3);
  if (closingTag != null) {
   // A closing tag also closes every tag opened after its match
   int i = openTags.indexOf(tagName);
   if (i != -1)
    // Copy the tail so later inserts don't go through a subList view
    openTags = new ArrayList<String>(openTags.subList(i + 1, openTags.size()));
  }
  else if (selfClosing == null && !html4Singlets.contains(tagName))
   // Most recently opened tag goes first, so it is closed first
   openTags.add(0, tagName);
 }

 if (words <= length)
  // Nothing to truncate, so don't try to close any tags
  return html;
 StringBuilder out = new StringBuilder(html.substring(0, endTextPos));
 // Close any tags still open at the truncation point
 for (String tag : openTags)
  out.append("</").append(tag).append(">");

 return out.toString();
}

Feel free to use the code under the Modified BSD License, but keep in mind that, unlike Django's version, it has not been thoroughly tested and may not function correctly in all cases.
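
To see how the two regular expressions divide the work, here is a small self-contained sketch. The two patterns are copied from the method above; everything else (the class PatternDemo and the helpers countedWords and describeTag) is made up for illustration and is not part of Django or of the adapted code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternDemo {
    // Entities and tags match without a capture; real words land in group 1.
    static final Pattern WORDS = Pattern.compile("&.*?;|<.*?>|(\\w[\\w-]*)");
    // Group 1: "/" of a closing tag, group 2: tag name, group 3: trailing " /".
    static final Pattern TAG = Pattern.compile("<(/)?([^ ]+?)(?: (/)| .*?)?>");

    // Hypothetical helper: returns only the tokens that count as words.
    static List<String> countedWords(String html) {
        List<String> result = new ArrayList<String>();
        Matcher m = WORDS.matcher(html);
        while (m.find()) {
            if (m.group(1) != null)
                result.add(m.group(1));
        }
        return result;
    }

    // Hypothetical helper: classifies a single token using the tag pattern.
    static String describeTag(String token) {
        Matcher m = TAG.matcher(token);
        if (!m.lookingAt())
            return "not a tag";
        String name = m.group(2).toLowerCase();
        if (m.group(1) != null)
            return "closing " + name;
        if (m.group(3) != null)
            return "self-closing " + name;
        return "opening " + name;
    }

    public static void main(String[] args) {
        // prints [Hello, big, world]
        System.out.println(countedWords("<p>Hello <b>big&nbsp;world</b></p>"));
        System.out.println(describeTag("<a href=\"x\">")); // opening a
        System.out.println(describeTag("<br />"));         // self-closing br
        System.out.println(describeTag("</p>"));           // closing p
        System.out.println(describeTag("&amp;"));          // not a tag
    }
}
```

Tags and entities are consumed by the first pattern but never counted, which is why the truncation limit applies to visible words only.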

Links:
Django Project Website
Original Source Code

Thursday, November 3, 2011

Long Domain Names and RDP Clients

A few days ago, I encountered a strange authentication problem when I tried to connect to a Windows Server 2008 machine via RDP using several Android remote desktop applications. The server was running AD and had an unusually long domain name. While Vista's native remote desktop client had no problem connecting to the server, both Jump Desktop and Pocket Cloud on my Android phone complained about "wrong username/password". On Fedora, Remmina (latest 0.9.3) wasn't doing any better, so I decided to find out why.

After a bit of inspection, I realised that all these clients truncate the domain field in the authentication settings to some arbitrary length. Although I can't be sure what backend the Android applications use, the issue on Linux is certainly related to FreeRDP, as rdesktop connects to the Windows machine without any problem.

The fix on Fedora was straightforward: just downgrade Remmina to the repository's 0.7.5 (older versions of Remmina use rdesktop as the backend). And for the Android apps?

If you combine the domain with the username in the following way, you can get RDP working in the Jump Desktop and Pocket Cloud Android applications (hackish, but it works until the developers fix the bug):
<domain>\<username>
You should leave the domain field blank for it to work.
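
As an illustration only, the combined value can be built like this. The class name, method name, and the sample domain and user are hypothetical; the only thing taken from the workaround above is the <domain>\<username> format that goes into the username field.

```java
class RdpLoginDemo {
    // Hypothetical helper: joins domain and user with a single backslash.
    // The backslash has to be escaped inside a Java string literal.
    static String combinedUsername(String domain, String username) {
        return domain + "\\" + username;
    }

    public static void main(String[] args) {
        // Sample values, not from the original setup.
        String user = combinedUsername("VERYLONGDOMAINNAME", "alice");
        System.out.println("Username field: " + user); // VERYLONGDOMAINNAME\alice
        System.out.println("Domain field:   (leave blank)");
    }
}
```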

Links:
Jump Desktop
Pocket Cloud
Remmina