OCR and Android

On Device or in the Cloud?

Before deciding on an OCR library, one needs to decide where the OCR process should take place: on the smartphone or in the cloud. Each approach has its advantages.
On-device OCR can be performed without an Internet connection; instead of sending a photo, which can potentially be huge (many phones now have 8 or 12 megapixel cameras), the text is recognized by an on-board OCR engine.
However, OCR libraries tend to be large, i.e. the mobile application will be of considerable size. Depending on the amount of text that needs to be recognized and the available data-transfer speed, a cloud service may deliver the result faster. A cloud service can also be updated more easily, but individually optimizing (training) an OCR engine may work better when done locally on the device.

Which OCR Library to Choose?

After taking a closer look at all the comparisons, Tesseract stands out. It provides good accuracy, it’s open source and Apache-licensed, and it has broad language support. It was created by HP and is now developed by Google.

Also, since Tesseract is open source and Apache-licensed, we can take the source and port it to the Android platform, or put it on a Web server to run our very own cloud service.

A tesseract, incidentally, is a four-dimensional object, much like a cube is a three-dimensional object: a square has two dimensions, six squares form the boundary of a cube in three dimensions, and a tesseract is built the same way in four dimensions.

1. Tesseract

The Tesseract OCR engine was developed at Hewlett Packard Labs and is currently sponsored by Google. It was among the top three OCR engines in terms of character accuracy in 1995. http://code.google.com/p/tesseract-ocr/

1.1. Running Tesseract locally on a Mac

Like with so many other Unix and Linux tools, Homebrew (http://mxcl.github.com/homebrew/) is the easiest and most flexible way to install the UNIX tools Apple didn’t include with OS X. Once Homebrew is installed (https://github.com/mxcl/homebrew/wiki/installation), Tesseract can be installed on OS X as easily as:
$ brew install tesseract
Once installed, $ brew info tesseract will return something like this:

tesseract 3.00

http://code.google.com/p/tesseract-ocr/

Depends on: libtiff
/usr/local/Cellar/tesseract/3.00 (316 files, 11M)
Tesseract is an OCR (Optical Character Recognition) engine.
The easiest way to use it is to convert the source to a Grayscale tiff:
`convert source.png -type Grayscale terre_input.tif`
then run tesseract:
`tesseract terre_input.tif output`

http://github.com/mxcl/homebrew/commits/master/Library/Formula/tesseract.rb


Tesseract doesn’t come with a GUI and instead runs from a command-line interface. To OCR a TIFF-encoded image located on your desktop, you would do something like this:
$ tesseract ~/Desktop/cox.tiff ~/Desktop/cox
Using the image below, Tesseract wrote the recognized text with perfect accuracy into ~/Desktop/cox.txt.

There are at least two projects providing a GUI front-end for Tesseract on OS X:

  1. TesseractGUI, a native OS X client: http://download.dv8.ro/files/TesseractGUI/
  2. VietOCR, a Java Client: http://vietocr.sourceforge.net/

1.2. Running Tesseract as a Cloud-Service on a Linux Server

One of the fastest and easiest ways to deploy Tesseract as a Web service uses Tornado (http://www.tornadoweb.org/), an open source (Apache-licensed) Python non-blocking web server. Since Tesseract accepts TIFF-encoded images, but our cloud service should rather work with the more popular JPEG image format, we also need to deploy the free Python Imaging Library (http://www.pythonware.com/products/pil/); the license terms are here: http://www.pythonware.com/products/pil/license.htm

The deployment on Ubuntu 11.10 64-bit server looks something like this:

sudo apt-get install python-tornado
sudo apt-get install python-imaging
sudo apt-get install tesseract-ocr

1.2.1. The HTTP Server-Script for port 8080

#!/usr/bin/env python
import tornado.httpserver
import tornado.ioloop
import tornado.web
import pprint
import Image
from tesseract import image_to_string
import StringIO
import os.path
import uuid
class MainHandler(tornado.web.RequestHandler):
    def get(self):
        # return a minimal HTML upload form
        self.write('<html><body>'
                   '<form action="/" method="post" enctype="multipart/form-data">'
                   '<input type="file" name="the_file" />'
                   '<input type="submit" value="Submit" />'
                   '</form></body></html>')
    def post(self):
        self.set_header("Content-Type", "text/html")
        self.write("<html><body>")
        # create a unique file name for the uploaded image
        tempname = str(uuid.uuid4()) + ".jpg"
        # the uploaded image is the first (and only) file in the request
        myimg = Image.open(StringIO.StringIO(self.request.files.items()[0][1][0]['body']))
        myfilename = os.path.join(os.path.dirname(__file__), "static", tempname)
        # save image to file as JPEG
        myimg.save(myfilename)
        # do OCR, print result
        self.write(image_to_string(myimg))
        self.write("</body></html>")
settings = {
    "static_path": os.path.join(os.path.dirname(__file__), "static"),
}
application = tornado.web.Application([
    (r"/", MainHandler),
], **settings)
if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8080)
    tornado.ioloop.IOLoop.instance().start()

The server receives a JPEG image file and stores it locally in the ./static directory before calling image_to_string, which is defined in the Python script below:

1.2.2. image_to_string function implementation

#!/usr/bin/env python
tesseract_cmd = 'tesseract'
import Image
import StringIO
import subprocess
import sys
import os
__all__ = ['image_to_string']
def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False):
    '''
    runs the command:
        `tesseract_cmd` `input_filename` `output_filename_base`
    returns the exit status of tesseract, as well as tesseract's stderr output
    '''
    command = [tesseract_cmd, input_filename, output_filename_base]
    if lang is not None:
        command += ['-l', lang]
    if boxes:
        command += ['batch.nochop', 'makebox']
    proc = subprocess.Popen(command,
            stderr=subprocess.PIPE)
    return (proc.wait(), proc.stderr.read())
def cleanup(filename):
    ''' tries to remove the given filename. Ignores non-existent files '''
    try:
        os.remove(filename)
    except OSError:
        pass
def get_errors(error_string):
    '''
    returns all lines in the error_string that start with the string "error"
    '''
    lines = error_string.splitlines()
    error_lines = tuple(line for line in lines if line.find('Error') >= 0)
    if len(error_lines) > 0:
        return '\n'.join(error_lines)
    else:
        return error_string.strip()
def tempnam():
    ''' returns a temporary file-name '''
    # prevent os.tmpname from printing an error...
    stderr = sys.stderr
    try:
        sys.stderr = StringIO.StringIO()
        return os.tempnam(None, 'tess_')
    finally:
        sys.stderr = stderr
class TesseractError(Exception):
    def __init__(self, status, message):
        self.status = status
        self.message = message
        self.args = (status, message)
def image_to_string(image, lang=None, boxes=False):
    '''
    Runs tesseract on the specified image. First, the image is written to disk,
    and then the tesseract command is run on the image. Tesseract's result is
    read, and the temporary files are erased.
    '''
    input_file_name = '%s.bmp' % tempnam()
    output_file_name_base = tempnam()
    if not boxes:
        output_file_name = '%s.txt' % output_file_name_base
    else:
        output_file_name = '%s.box' % output_file_name_base
    try:
        image.save(input_file_name)
        status, error_string = run_tesseract(input_file_name,
                                             output_file_name_base,
                                             lang=lang,
                                             boxes=boxes)
        if status:
            errors = get_errors(error_string)
            raise TesseractError(status, errors)
        f = file(output_file_name)
        try:
            return f.read().strip()
        finally:
            f.close()
    finally:
        cleanup(input_file_name)
        cleanup(output_file_name)
if __name__ == '__main__':
    if len(sys.argv) == 2:
        filename = sys.argv[1]
        try:
            image = Image.open(filename)
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print image_to_string(image)
    elif len(sys.argv) == 4 and sys.argv[1] == '-l':
        lang = sys.argv[2]
        filename = sys.argv[3]
        try:
            image = Image.open(filename)
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print image_to_string(image, lang=lang)
    else:
        sys.stderr.write('Usage: python tesseract.py [-l language] input_file\n')
        exit(2)

1.2.3. The Service deploy/start Script

description  "OCR WebService"
start on runlevel [2345]
stop on runlevel [!2345]
pre-start script
mkdir /tmp/ocr
mkdir /tmp/ocr/static
cp /usr/share/ocr/*.py /tmp/ocr
end script
exec /tmp/ocr/tesserver.py

After the service has been started, it can be accessed through a Web browser, as shown here: http://proton.techcasita.com:8080. (I’m currently running Tesseract 3.01 on Ubuntu Linux 11.10 64-bit; please be gentle, it runs on an Intel Atom CPU 330 @ 1.60GHz with 4 cores, typically found in netbooks.) The HTML-encoded result looks something like this:

<html><body>Contact Us
www. cox.com
Customer Serv 760-788-9000
Repair 76O—788~71O0
Cox Telephone 888-222-7743</body></html>
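For a quick test without a browser, the service can also be exercised from the command line; the file name below is just a placeholder, and curl is assumed to be installed on the client:

$ curl -F "the_file=@sample.jpg" http://proton.techcasita.com:8080/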

1.3 Accessing the Tesseract Cloud-Service from Android

The OCRTaskActivity below utilizes Android’s built-in AsyncTask as well as the Apache Software Foundation’s HttpComponents library HttpClient 4.1.2, available here: http://hc.apache.org/httpcomponents-client-ga/index.html. OCRTaskActivity expects the image to be passed in as the Intent extra “ByteArray” of type ByteArray. The OCR result is returned to the calling Activity as OCR_TEXT, as shown here:

setResult(Activity.RESULT_OK, getIntent().putExtra("OCR_TEXT", result));
import android.app.Activity;
import android.graphics.BitmapFactory;
import android.os.AsyncTask;
import android.os.Bundle;
import android.util.Log;
import android.view.View;
import android.widget.ImageView;
import android.widget.ProgressBar;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.HttpMultipartMode;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.ByteArrayBody;
import org.apache.http.entity.mime.content.StringBody;
import org.apache.http.impl.client.DefaultHttpClient;
import java.io.BufferedReader;
import java.io.InputStreamReader;
public class OCRTaskActivity extends Activity {
    private static String LOG_TAG = OCRTaskActivity.class.getSimpleName();
    private static String[] URL_STRINGS = {"http://proton.techcasita.com:8080"};
    private byte[] mBA;
    private ProgressBar mProgressBar;
    @Override
    public void onCreate(final Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.ocr);
        mBA = getIntent().getExtras().getByteArray("ByteArray");
        ImageView iv = (ImageView) findViewById(R.id.ImageView);
        iv.setImageBitmap(BitmapFactory.decodeByteArray(mBA, 0, mBA.length));
        mProgressBar = (ProgressBar) findViewById(R.id.progressBar);
        OCRTask task = new OCRTask();
        task.execute(URL_STRINGS);
    }
    private class OCRTask extends AsyncTask<String, Void, String> {
        @Override
        protected String doInBackground(final String... urls) {
            String response = "";
            for (String url : urls) {
                try {
                    response = executeMultipartPost(url, mBA);
                    Log.v(LOG_TAG, "Response:" + response);
                    break;
                } catch (Throwable ex) {
                    Log.e(LOG_TAG, "error: " + ex.getMessage());
                }
            }
            return response;
        }
        @Override
        protected void onPostExecute(final String result) {
            mProgressBar.setVisibility(View.GONE);
            setResult(Activity.RESULT_OK, getIntent().putExtra("OCR_TEXT", result));
            finish();
        }
    }
    private String executeMultipartPost(final String stringUrl, final byte[] bm) throws Exception {
        HttpClient httpClient = new DefaultHttpClient();
        HttpPost postRequest = new HttpPost(stringUrl);
        ByteArrayBody bab = new ByteArrayBody(bm, "the_image.jpg");
        MultipartEntity reqEntity = new MultipartEntity(HttpMultipartMode.BROWSER_COMPATIBLE);
        reqEntity.addPart("uploaded", bab);
        reqEntity.addPart("name", new StringBody("the_file"));
        postRequest.setEntity(reqEntity);
        HttpResponse response = httpClient.execute(postRequest);
        BufferedReader reader = new BufferedReader(new InputStreamReader(response.getEntity().getContent(), "UTF-8"));
        String sResponse;
        StringBuilder s = new StringBuilder();
        while ((sResponse = reader.readLine()) != null) {
            s = s.append(sResponse).append('\n');
        }
        int i = s.indexOf("body");
        int j = s.lastIndexOf("body");
        return s.substring(i + 5, j - 2);
    }
}
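A hypothetical caller might hand the JPEG bytes to OCRTaskActivity and pick the recognized text up again in onActivityResult(). The request code, the jpegBytes variable, and the startOcr() helper below are illustrative; the extras follow the ones used above (requires android.content.Intent):

    private static final int REQUEST_OCR = 1;  // arbitrary request code (illustrative)

    private void startOcr(byte[] jpegBytes) {
        Intent intent = new Intent(this, OCRTaskActivity.class);
        intent.putExtra("ByteArray", jpegBytes);      // JPEG-encoded image data
        startActivityForResult(intent, REQUEST_OCR);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == REQUEST_OCR && resultCode == RESULT_OK && data != null) {
            String recognizedText = data.getStringExtra("OCR_TEXT");
            // use the recognized text, e.g. display it in a TextView
        }
    }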

1.4. Building a Tesseract native Android Library to be bundled with an Android App

This approach allows an Android application to perform OCR even without a network connection, i.e. the OCR engine is on board. There are currently two source bases to start from:

  1. Tesseract Tools for Android is a set of Android APIs and build files for the Tesseract OCR and Leptonica image processing libraries:
    svn checkout http://tesseract-android-tools.googlecode.com/svn/trunk/ tesseract-android-tools
  2. A fork of Tesseract Tools for Android (tesseract-android-tools) that adds some additional functions:
    git clone git://github.com/rmtheis/tess-two.git

… I went with option 2.

1.4.1. Building the native lib

Each project can be built with the same build steps (see below), but neither works with Android’s NDK r7. However, going back to NDK r6b solved that problem. Here are the build steps; it takes a little while, even on a fast machine.

cd <project-directory>/tess-two
export TESSERACT_PATH=${PWD}/external/tesseract-3.01
export LEPTONICA_PATH=${PWD}/external/leptonica-1.68
export LIBJPEG_PATH=${PWD}/external/libjpeg
ndk-build
android update project --path .
ant release

The build steps create the native libraries in the libs/armeabi and libs/armeabi-v7a directories.
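With the classic Ant/Eclipse toolchain used above, an application project typically picks up the library through a single line in its project.properties; the relative path below is only an example and assumes the app and tess-two directories sit side by side:

android.library.reference.1=../tess-two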

With the tess-two project included as a library project and the JNI layer in place, calling into the native OCR library looks something like this:

1.4.2. Developing a simple Android App with built-in OCR capabilities

...
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init(DATA_PATH, LANG);
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
baseApi.end();
...

1.4.2.1. Libraries / TrainedData / App Size

The native libraries are about 3 MBytes in size. Additionally, a language- and font-dependent training resource file is needed.
The eng.traineddata file (e.g. available with the desktop version of Tesseract) is placed into the Android project’s assets/tessdata folder and deployed with the application, adding another 2 MBytes to the app. However, due to compression, the actual downloadable Android application is “only” about 4.1 MBytes.

During the first start of the application, the eng.traineddata resource file is copied to the phone’s SDCard.
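A minimal sketch of that first-run copy step might look like the following; it assumes DATA_PATH (the same path later passed to baseApi.init()) points to a writable directory on the SD card, and error handling is reduced to a log statement:

    // requires java.io.* and android.util.Log
    // Copies assets/tessdata/eng.traineddata to DATA_PATH/tessdata/ once, so that
    // TessBaseAPI.init(DATA_PATH, LANG) can find it on this and later runs.
    private void copyTrainedDataIfNeeded() {
        File tessdataDir = new File(DATA_PATH, "tessdata");
        File trainedData = new File(tessdataDir, "eng.traineddata");
        if (trainedData.exists()) {
            return;  // already copied during an earlier start
        }
        tessdataDir.mkdirs();
        try {
            InputStream in = getAssets().open("tessdata/eng.traineddata");
            OutputStream out = new FileOutputStream(trainedData);
            byte[] buf = new byte[8192];
            int len;
            while ((len = in.read(buf)) > 0) {
                out.write(buf, 0, len);
            }
            out.close();
            in.close();
        } catch (IOException e) {
            Log.e(LOG_TAG, "Could not copy eng.traineddata: " + e.toString());
        }
    }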

The ocr() method for the sample app may look something like this:

protected void ocr() {
        BitmapFactory.Options options = new BitmapFactory.Options();
        options.inSampleSize = 2;
        Bitmap bitmap = BitmapFactory.decodeFile(IMAGE_PATH, options);
        try {
            ExifInterface exif = new ExifInterface(IMAGE_PATH);
            int exifOrientation = exif.getAttributeInt(ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL);
            Log.v(LOG_TAG, "Orient: " + exifOrientation);
            int rotate = 0;
            switch (exifOrientation) {
                case ExifInterface.ORIENTATION_ROTATE_90:
                    rotate = 90;
                    break;
                case ExifInterface.ORIENTATION_ROTATE_180:
                    rotate = 180;
                    break;
                case ExifInterface.ORIENTATION_ROTATE_270:
                    rotate = 270;
                    break;
            }
            Log.v(LOG_TAG, "Rotation: " + rotate);
            if (rotate != 0) {
                // Getting width & height of the given image.
                int w = bitmap.getWidth();
                int h = bitmap.getHeight();
                // Setting pre rotate
                Matrix mtx = new Matrix();
                mtx.preRotate(rotate);
                // Rotating Bitmap
                bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
                // tesseract req. ARGB_8888
                bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);
            }
        } catch (IOException e) {
            Log.e(LOG_TAG, "Rotate or coversion failed: " + e.toString());
        }
        ImageView iv = (ImageView) findViewById(R.id.image);
        iv.setImageBitmap(bitmap);
        iv.setVisibility(View.VISIBLE);
        Log.v(LOG_TAG, "Before baseApi");
        TessBaseAPI baseApi = new TessBaseAPI();
        baseApi.setDebug(true);
        baseApi.init(DATA_PATH, LANG);
        baseApi.setImage(bitmap);
        String recognizedText = baseApi.getUTF8Text();
        baseApi.end();
        Log.v(LOG_TAG, "OCR Result: " + recognizedText);
        // clean up and show
        if (LANG.equalsIgnoreCase("eng")) {
            recognizedText = recognizedText.replaceAll("[^a-zA-Z0-9]+", " ");
        }
        if (recognizedText.length() != 0) {
            ((TextView) findViewById(R.id.field)).setText(recognizedText.trim());
        }
    }

OCR on Android

The popularity of smartphones, combined with their built-in high-quality cameras, has created a new category of mobile applications that benefit greatly from OCR.

OCR is a very mature technology with a broad range of available libraries to choose from. There are Apache- and BSD-licensed, fast and accurate solutions available from the open-source community; I have taken a closer look at Tesseract, which was created by HP and is now developed by Google.

Tesseract can be used to build a desktop application or a cloud service, and it can even be baked into a mobile Android application that performs on-board OCR. All three variations of OCR with the Tesseract library have been demonstrated above.

Focusing on mobile applications, however, it became very clear that even on phones with a 5MP camera, the accuracy of the results still varies greatly, depending on lighting conditions, font and font sizes, as well as surrounding artifacts.

Just like with the TeleForm application, even the best OCR engines perform poorly if the input image has not been prepared correctly. To make OCR work on a mobile device, no matter whether the OCR eventually runs on board or in the cloud, much development time needs to be spent training the engine; even more importantly, the image areas that are handed to the OCR engine must be carefully selected and prepared. It is going to be all about the pre-processing.

 

Reference: http://wolfpaulus.com/jounal/android-journal/android-and-ocr/

Top 5 Tools for network security monitoring

Security data can be found on virtually all systems in a corporate network. However, all systems do not provide equally valuable security context. While monitoring everything would be ideal, this is impractical for most organizations due to resource constraints. So what data sources should you prioritize to make the most of your monitoring efforts?

When it comes to security monitoring, context is key. The more relevant security context you have, the more likely it is you will successfully detect real security incidents while weeding out false positives (i.e. non-threats). In determining which devices and systems to monitor for security data, the first priority is to give yourself as much useful context as possible.

Based on a decade of monitoring experience, SecureWorks believes the top five sources of security context are:

Number One: Network-based Intrusion Detection and Prevention Systems (NIDS/NIPS)

NIDS and NIPS devices use signatures to detect security events on your network. Performing full packet inspection of network traffic at the perimeter or across key network segments, most NIDS/NIPS devices provide detailed alerts that help to detect:

  • Known vulnerability exploit attempts
  • Known Trojan activity
  • Anomalous behavior (depending on the IDS/IPS)
  • Port and Host scans

Number Two: Firewalls

Serving as the network’s gatekeeper, firewalls allow and log incoming and outgoing network connections based on your policies. Some firewalls also have basic NIDS/NIPS signatures to detect security events. Monitoring firewall logs and alerts helps to detect:

  • New and unknown threats, such as custom Trojan activity
  • Port and Host scans
  • Worm outbreaks
  • Minor anomalous behavior
  • Most any activity denied by firewall policy

Number Three: Host-based Intrusion Detection and Prevention Systems (HIDS/HIPS)

Like NIDS/NIPS, host-based intrusion detection and prevention systems utilize signatures to detect security events. But instead of inspecting network traffic, HIDS/HIPS agents are installed on servers to directly alert on security activity. Monitoring HIDS/HIPS alerts helps to detect:

  • Known vulnerability exploit attempts
  • Console exploit attempts
  • Exploit attempts performed over encrypted channels
  • Password grinding (manual or automated attempts to guess passwords)
  • Anomalous behavior by users or applications

Number Four: Network Devices with Access Control Lists (ACLs)

Network devices that can use ACLs, such as routers and VPN servers, have the ability to control network traffic based on permitted networks and hosts. Monitoring logs from devices with ACLs helps to detect:

  • New and unknown threats, such as custom Trojan activity
  • Port and Host scans
  • Minor anomalous behavior
  • Most anything denied by the ACLs

Number Five: Server and Application Logs

Many types of servers and applications log events such as login attempts and user activity. Depending on the extent of logging capabilities, monitoring server and application logs can help to detect:

  • Known and unknown exploit attempts
  • Password Grinding
  • Anomalous behavior by users or applications

It is important to understand that the incremental value of a data source will vary from situation to situation. A source’s purpose, its location in your network and the quality of the data it provides are a few of the many variables that must be considered when planning your security monitoring strategy.

Keep in mind that there are many other security technologies, network devices and log sources throughout your IT environment that may also provide beneficial context to your security monitoring efforts. For example, Unified Threat Management (UTM) devices which combine firewall, NIDS/NIPS and other capabilities onto a single device can be monitored to detect similar events as standalone firewalls and NIDS/NIPS devices.

By monitoring the assets that provide the highest-value security context, you can optimize security monitoring efforts. Doing so will provide faster, more accurate detection of threats while making the most of your security resources. For additional information on monitoring security events and other security topics, please visit the SecureWorks website.

 

Featured Gartner Research:

What Organizations are Spending on IT Security

According to research and advisory firm Gartner Inc., “Many CIOs and chief information security officers (CISOs) are uncertain about what is a ‘normal’ level of security spending in terms of a percentage of the overall IT budget – especially during economic uncertainty.” This research note will help IT managers understand how organizations are investing in their information security and compare their spending with that of their peers.

View the complimentary Gartner report made available to you by SecureWorks.

 

Security 101: Web Application Firewalls

What is a Web Application Firewall?
A web application firewall (WAF) is a tool designed to protect externally-facing web applications used for online banking, Internet retail sales, discussion boards and many other functions from application layer attacks such as cross-site scripting (XSS), cross-site request forgery (XSRF) and SQL injection. Because web application attacks exploit flaws in application logic that is often developed internally, each attack is unique to its target application. This makes it difficult to detect and prevent application layer attacks using existing defenses such as network firewalls and NIDS/NIPS.

How do WAFs Work?
WAFs utilize a set of rules or policies to control communications to and from a web application. These rules are designed to block common application layer attacks. Architecturally, a WAF is deployed in front of an application to intercept communications and enforce policies before they reach the application.

What are the Risks of Deploying a WAF?

Depending on the importance of the web application to your business, the risk of experiencing false positives that interrupt legitimate communications can be a concern. To provide sound protection with minimal false positives, WAF rules and policies must be tailored to the application(s) the WAF is defending. In many cases, this requires significant up-front customization based on in-depth knowledge of the application in question. This effort must also be maintained to address modifications to the application over time.

What are the Benefits of Deploying a WAF?

A WAF can be beneficial in terms of both security and compliance. Applications are a prime target for today’s hackers. Also, the Payment Card Industry (PCI) Data Security Standard requires companies who process, store or transmit payment card data to protect their externally-facing web applications from known attacks (Requirement 6.6). If managed properly and used in conjunction with regular application code reviews, vulnerability testing and remediation, WAFs can be a solid option for protecting against web application attacks and satisfying related compliance requirements.

 

Reference: http://www.secureworks.com/resources/newsletter/2008-07/

NIDS (Network Intrusion Detection System) and NIPS (Network Intrusion Prevention System)

NIDS and NIPS (Behavior based, signature based, anomaly based, heuristic)

An intrusion detection system (IDS) is software that runs on a server or network device to monitor and track network activity. By using an IDS, a network administrator can configure the system to monitor network activity for suspicious behavior that can indicate unauthorized access attempts. IDSs can be configured to evaluate system logs, look at suspicious network activity, and disconnect sessions that appear to violate security settings.

IDSs can be sold with firewalls. Firewalls by themselves will prevent many common attacks, but they don’t usually have the intelligence or the reporting capabilities to monitor the entire network. An IDS, in conjunction with a firewall, allows both a reactive posture with the firewall and a preventive posture with the IDS.

In response to an event, the IDS can react by disabling systems, shutting down ports, ending sessions, using deception (redirecting to a honeypot), and even potentially shutting down your network. A network-based IDS that takes active steps to halt or prevent an intrusion is called a network intrusion prevention system (NIPS). When operating in this mode, it is considered an active system.

Passive detection systems log the event and rely on notifications to alert administrators of an intrusion. Shunning or ignoring an attack is an example of a passive response, where an invalid attack can be safely ignored. A disadvantage of passive systems is the lag between intrusion detection and any remediation steps taken by the administrator.

An intrusion prevention system (IPS), like an IDS, follows the same process of gathering and identifying data and behavior, with the added ability to block (prevent) the activity.

A network-based IDS examines network patterns, such as an unusual number of requests destined for a particular server or service, such as an FTP server. Network IDS systems should be located as far toward the network edge as possible, e.g. on the firewall, a network tap, a span port, or a hub, to monitor external traffic. Host IDS systems, on the other hand, are placed on individual hosts, where they can more efficiently monitor internally generated events.

Using both network and host IDS enhances the security of the environment.

Snort is an example of a network intrusion detection and prevention system. It conducts traffic analysis and packet logging on IP networks. Snort uses a flexible rule-based language to describe traffic that it should collect or pass, and a modular detection engine.

Network based intrusion detection attempts to identify unauthorized, illicit, and anomalous behavior based solely on network traffic. Using the captured data, the Network IDS processes and flags any suspicious traffic. Unlike an intrusion prevention system, an intrusion detection system does not actively block network traffic. The role of a network IDS is passive, only gathering, identifying, logging and alerting.

Host based intrusion detection system (HIDS) attempts to identify unauthorized, illicit, and anomalous behavior on a specific device. HIDS generally involves an agent installed on each system, monitoring and alerting on local OS and application activity. The installed agent uses a combination of signatures, rules, and heuristics to identify unauthorized activity. The role of a host IDS is passive, only gathering, identifying, logging, and alerting. Tripwire is an example of a HIDS.

There are no fully mature open standards for intrusion detection at present. The Internet Engineering Task Force (IETF), the body which develops new Internet standards, has a working group developing a common format for IDS alerts.

The following types of monitoring methodologies can be used to detect intrusions and malicious behavior: signature, anomaly, heuristic and rule-based monitoring.

A signature based IDS will monitor packets on the network and compare them against a database of signatures or attributes from known malicious threats. This is similar to the way most antivirus software detects malware. The issue is that there will be a lag between a new threat being discovered in the wild and the signature for detecting that threat being applied to your IDS.

A network IDS signature is a pattern that we want to look for in traffic. Signatures range from very simple – checking the value of a header field – to highly complex signatures that may actually track the state of a connection or perform extensive protocol analysis.

An anomaly-based IDS examines ongoing traffic, activity, transactions, or behavior for anomalies (things outside the norm) on networks or systems that may indicate attack. An IDS which is anomaly based will monitor network traffic and compare it against an established baseline. The baseline will identify what is “normal” for that network, what sort of bandwidth is generally used, what protocols are used, what ports and devices generally connect to each other, and alert the administrator when traffic is detected which is anomalous to the baseline.

Heuristic-based security monitoring uses an initial database of known attack types but dynamically alters the signatures based on the learned behavior of network traffic. A heuristic system uses algorithms to analyze the traffic passing through the network. Heuristic systems require more fine-tuning to prevent false positives in your network.

A behavior-based system looks for variations in behavior such as unusually high traffic, policy violations, and so on. By looking for deviations in behavior, it is able to recognize potential threats and respond quickly.
Similar to firewall access control rules, a rule-based security monitoring system relies on the administrator to create rules and determine the actions to take when those rules are transgressed.

References:
  • http://netsecurity.about.com/cs/hackertools/a/aa030504.htm
  • http://www.sans.org/security-resources/idfaq/
  • CompTIA Security+ Study Guide: Exam SY0-301, Fifth Edition by Emmett Dulaney
  • Mike Meyers’ CompTIA Security+ Certification Passport, Second Edition by T. J. Samuelle
  • http://neokobo.blogspot.com/2012/01/118-nids-and-nips.html

UI, UX: Designing Functionality in the Tech Industry

Design is a rather broad and vague term. When someone says “I’m a designer,” it is not immediately clear what they actually do day to day. There are a number of different responsibilities encompassed by the umbrella term designer.

Design-related roles exist in a range of areas from industrial design (cars, furniture) to print (magazines, other publications) to tech (websites, mobile apps). With the relatively recent influx of tech companies focused on creating interfaces for screens, many new design roles have emerged. Job titles like UX or UI designer are confusing to the uninitiated and unfamiliar even to designers who come from other industries.

Let’s attempt to distill what each of these titles really means within the context of the tech industry.

UX DESIGNER (USER EXPERIENCE DESIGNER)

UX designers are primarily concerned with how the product feels. A given design problem has no single right answer. UX designers explore many different approaches to solving a specific user problem. The broad responsibility of a UX designer is to ensure that the product logically flows from one step to the next. One way that a UX designer might do this is by conducting in-person user tests to observe users’ behavior. By identifying verbal and non-verbal stumbling blocks, they refine and iterate to create the “best” user experience. An example project is creating a delightful onboarding flow for a new user.

“Define interaction models, user task flows, and UI specifications. Communicate scenarios, end-to-end experiences, interaction models, and screen designs to stakeholders. Work with our creative director and visual designers to incorporate the visual identity of Twitter into features. Develop and maintain design wireframes, mockups, and specifications as needed.”

Experience Designer job description at Twitter

Example of an app’s screens created by a UX designer. Credit: Kitchenware Pro Wireframe Kit by Neway Lau on Dribbble.

Deliverables: Wireframes of screens, storyboards, sitemap

Tools of the trade: Photoshop, Sketch, Illustrator, Fireworks, InVision

You might hear them say this in the wild: “We should show users the ‘Thank You’ page once they have finished signing up.”

UI DESIGNER (USER INTERFACE DESIGNER)

Unlike UX designers who are concerned with the overall feel of the product, user interface designers are particular about how the product is laid out. They are in charge of designing each screen or page with which a user interacts and ensuring that the UI visually communicates the path that a UX designer has laid out. For example, a UI designer creating an analytics dashboard might front load the most important content at the top, or decide whether a slider or a control knob makes the most intuitive sense to adjust a graph. UI designers are also typically responsible for creating a cohesive style guide and ensuring that a consistent design language is applied across the product. Maintaining consistency in visual elements and defining behavior such as how to display error or warning states fall under the purview of a UI designer.

“Concept and implement the visual language of Airbnb.com. Create and advance site-wide style guides.”

-UI Designer job description at Airbnb

The boundary between UI and UX designers is fairly blurred and it is not uncommon for companies to opt to combine these roles.

A UI designer defines the overall layout and look & feel of an app. Credit: Metro Style Interface 4 by Ionut Zamfir on Dribbble.

Tools of the trade: Photoshop, Sketch, Illustrator, Fireworks

You might hear them say this in the wild: “The login and sign up links should be moved to the top right corner.”

VISUAL DESIGNER (GRAPHIC DESIGNER)

A visual designer is the one who pushes pixels. If you ask a non-designer what a designer does, this is probably what comes to mind first. Visual designers are not concerned with how screens link to each other, nor how someone interacts with the product. Instead, their focus is on crafting beautiful icons, controls, and visual elements and making use of suitable typography. Visual designers sweat the small details that others overlook and frequently operate at the 4X to 8X zoom level in Photoshop.

“Produce high-quality visual designs—from concept to execution, including those for desktop, web, and mobile devices at a variety of resolutions (icons, graphics, and marketing materials). Create and iterate on assets that reflect a brand, enforce a language, and inject beauty and life into a product.”

Visual Designer job description at Google

It is also fairly common for UI designers to pull double duty and create the final pixel perfect assets. Some companies choose not to have a separate visual designer role.

A visual designer lays out guides and adjusts every single pixel to ensure that the end result is perfect. Credits: iOS 7 Guide Freebie PSD by Seevi kargwal on Dribbble.

Tools of the trade: Photoshop, Sketch

You might hear them say this in the wild: “The kerning is off and the button should be 1 pixel to the left!”

INTERACTION DESIGNER (MOTION DESIGNER)

Remember the subtle bouncing animation when you pull to refresh in the Mail app on your iPhone? That’s the work of a motion designer. Unlike visual designers who usually deal with static assets, motion designers create animation inside an app. They deal with what the interface does after a user touches it. For example, they decide how a menu should slide in, what transition effects to use, and how a button should fan out. When done well, motion becomes an integral part of the interface by providing visual clues as to how to use the product.

“Proficiency in graphic design, motion graphics, digital art, a sensitivity to typography and color, a general awareness of materials/textures, and a practical grasp of animation. Knowledge of iOS, OS X, Photoshop and Illustrator as well as familiarity with Director (or equivalent), Quartz Composer (or equivalent), 3D computer modeling, motion graphics are required.”

-Interaction Designer job description at Apple

Tools of the trade: AfterEffects, Core Composer, Flash, Origami

You might hear them say this in the wild: “The menu should ease in from the left in 800ms.”

UX RESEARCHER (USER RESEARCHER)

A UX researcher is the champion of a user’s needs. The goal of a researcher is to answer the twin questions of “Who are our users?” and “What do our users want?” Typically, this role entails interviewing users, researching market data, and gathering findings. Design is a process of constant iteration. Researchers may assist with this process by conducting A/B tests to tease out which design option best satisfies user needs. UX researchers are typically mainstays at large companies, where the access to a plethora of data gives them ample opportunity to draw statistically significant conclusions.

“Work closely with product teams to identify research topics. Design studies that address both user behavior and attitudes. Conduct research using a wide variety of qualitative methods and a subset of quantitative methods, such as surveys.”

UX Researcher job description at Facebook

UX designers also occasionally carry out the role of UX researchers.

Deliverables: User personas, A/B test results, Investigative user studies & interviews

Tools of the trade: Mic, Paper, Docs

You might hear them say this in the wild: “From our research, a typical user…”

FRONT-END DEVELOPER (UI DEVELOPER)

Front-end developers are responsible for creating a functional implementation of a product’s interface. Usually, a UI designer hands off a static mockup to the front-end developer who then translates it into a working, interactive experience. Front-end developers are also responsible for coding the visual interactions that the motion designer comes up with.

Tools of the trade: CSS, HTML, JavaScript

You might hear them say this in the wild: “I’m using a 960px 12-column grid system.”

PRODUCT DESIGNER

Product designer is a catch-all term used to describe a designer who is generally involved in the creation of the look and feel of a product.

The role of a product designer isn’t well-defined and differs from one company to the next. A product designer may do minimal front-end coding, conduct user research, design interfaces, or create visual assets. From start to finish, a product designer helps identify the initial problem, sets benchmarks to address it, and then designs, tests, and iterates on different solutions. Some companies that want more fluid collaboration within the various design roles opt to have this title to encourage the whole design team to collectively own the user experience, user research, and visual design elements.

Some companies use “UX designer” or simply “designer” as a catch-all term. Reading the job description is the best way to figure out how the company’s design team divides the responsibilities.

“Own all facets of design: interaction, visual, product, prototyping. Create pixel-perfect mocks and code for new features across web and mobile.”

Product Designer job description at Pinterest

“I AM LOOKING FOR A DESIGNER”

This is the single most common phrase I hear from new startups. What they are usually looking for is someone who can do everything described above. They want someone who can make pretty icons, create A/B-tested landing sites, logically arrange UI elements on screen, and maybe even do some front-end development. Due to the broad, sweeping scope of this role, we usually hear smaller companies asking to hire a “designer” rather than being specific in their needs.

The boundaries between each of these various design roles are very fluid. Some UX designers are also expected to do interaction design, and often UI designers are expected to push pixels as well. The best way to look for the right person is to describe what you expect the designer to do within your company’s process, and choose a title that best represents the primary task of that person.

OWL (Web Ontology Language)

1 Introduction

The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, making it easier for machines to automatically process and integrate information available on the Web. The Semantic Web will build on XML’s ability to define customized tagging schemes [XML] and RDF’s flexible approach to representing data [RDF Concepts]. The next element required for the Semantic Web is a web ontology language which can formally describe the semantics of classes and properties used in web documents. In order for machines to perform useful reasoning tasks on these documents, the language must go beyond the basic semantics of RDF Schema [RDF Vocabulary].

This document is one part of the specification of OWL, the Web Ontology Language. The Document Roadmap section of the OWL Overview document describes each of the other documents. This document enumerates the requirements of a web ontology language as perceived by the working group. However, it is expected that future languages will extend OWL, adding, among other things, greater logical capabilities and the ability to establish trust on the Semantic Web.

We motivate the need for a web ontology language by describing six use cases. Some of these use cases are based on efforts currently underway in industry and academia, others demonstrate more long-term possibilities. The use cases are followed by design goals that describe high-level objectives and guidelines for the development of the language. These design goals will be considered when evaluating proposed features. The section on Requirements presents a set of features that should be in the language and gives motivations for those features. The Objectives section describes a list of features that might be useful for many use cases but may not necessarily be addressed by the working group.

The Web Ontology Working Group charter tasks the group to produce this more expressive semantics and to specify mechanisms by which the language can provide “more complex relationships between entities including: means to limit the properties of classes with respect to number and type, means to infer that items with various properties are members of a particular class, a well-defined model of property inheritance, and similar semantic extensions to the base languages.” The detailed specification of the web ontology language will take into consideration:

  • the design goals and requirements that are contained in this document
  • review comments on this document from public feedback, invited experts and working group members
  • specifications of or proposals for languages that meet many of these requirements

1.1 What is an ontology?

An ontology defines the terms used to describe and represent an area of knowledge. Ontologies are used by people, databases, and applications that need to share domain information (a domain is just a specific subject area or area of knowledge, like medicine, tool manufacturing, real estate, automobile repair, financial management, etc.). Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them (note that here and throughout this document, definition is not used in the technical sense understood by logicians). They encode knowledge in a domain and also knowledge that spans domains. In this way, they make that knowledge reusable.

The word ontology has been used to describe artifacts with different degrees of structure. These range from simple taxonomies (such as the Yahoo hierarchy), to metadata schemes (such as the Dublin Core), to logical theories. The Semantic Web needs ontologies with a significant degree of structure. These need to specify descriptions for the following kinds of concepts:

  • Classes (general things) in the many domains of interest
  • The relationships that can exist among things
  • The properties (or attributes) those things may have

Ontologies are usually expressed in a logic-based language, so that detailed, accurate, consistent, sound, and meaningful distinctions can be made among the classes, properties, and relations. Some ontology tools can perform automated reasoning using the ontologies, and thus provide advanced services to intelligent applications such as: conceptual/semantic search and retrieval, software agents, decision support, speech and natural language understanding, knowledge management, intelligent databases, and electronic commerce.

Ontologies figure prominently in the emerging Semantic Web as a way of representing the semantics of documents and enabling the semantics to be used by web applications and intelligent agents. Ontologies can prove very useful for a community as a way of structuring and defining the meaning of the metadata terms that are currently being collected and standardized. Using ontologies, tomorrow’s applications can be “intelligent,” in the sense that they can more accurately work at the human conceptual level.

Ontologies are critical for applications that want to search across or merge information from diverse communities. Although XML DTDs and XML Schemas are sufficient for exchanging data between parties who have agreed to definitions beforehand, their lack of semantics prevents machines from reliably performing this task given new XML vocabularies. The same term may be used with (sometimes subtly) different meanings in different contexts, and different terms may be used for items that have the same meaning. RDF and RDF Schema begin to approach this problem by allowing simple semantics to be associated with identifiers. With RDF Schema, one can define classes that may have multiple subclasses and super classes, and can define properties, which may have sub properties, domains, and ranges. In this sense, RDF Schema is a simple ontology language. However, in order to achieve interoperation between numerous, autonomously developed and managed schemas, richer semantics are needed. For example, RDF Schema cannot specify that the Person and Car classes are disjoint, or that a string quartet has exactly four musicians as members.

One of the goals of this document is to specify what is needed in a web ontology language. These requirements will be motivated by potential use cases and general design objectives that take into account the difficulties in applying the standard notion of ontologies to the unique environment of the Web.

1.2 Why OWL?

The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, making it easier for machines to automatically process and integrate information available on the Web. The Semantic Web will build on XML’s ability to define customized tagging schemes and RDF’s flexible approach to representing data. The first level above RDF required for the Semantic Web is an ontology language that can formally describe the meaning of terminology used in Web documents. If machines are expected to perform useful reasoning tasks on these documents, the language must go beyond the basic semantics of RDF Schema. The OWL Use Cases and Requirements Document provides more details on ontologies, motivates the need for a Web Ontology Language in terms of six use cases, and formulates design goals, requirements and objectives for OWL.

OWL has been designed to meet this need for a Web Ontology Language. OWL is part of the growing stack of W3C recommendations related to the Semantic Web.

  • XML provides a surface syntax for structured documents, but imposes no semantic constraints on the meaning of these documents.
  • XML Schema is a language for restricting the structure of XML documents and also extends XML with datatypes.
  • RDF is a datamodel for objects (“resources”) and relations between them, provides a simple semantics for this datamodel, and these datamodels can be represented in an XML syntax.
  • RDF Schema is a vocabulary for describing properties and classes of RDF resources, with a semantics for generalization-hierarchies of such properties and classes.
  • OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. “exactly one”), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.
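As a concrete illustration of the last bullet point, the short sketch below asserts that Person and Car are disjoint classes and that a string quartet has exactly four members, two of the statements mentioned earlier that RDF Schema alone cannot express. The Apache Jena ontology API is used here purely for illustration; Jena is not part of the OWL specification, and the namespace and all class and property names are invented for the example:

import org.apache.jena.ontology.ObjectProperty;
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.rdf.model.ModelFactory;

public class OwlSketch {
    public static void main(String[] args) {
        String ns = "http://example.org/demo#";
        OntModel m = ModelFactory.createOntologyModel();
        // owl:disjointWith -- not expressible in RDF Schema
        OntClass person = m.createClass(ns + "Person");
        OntClass car = m.createClass(ns + "Car");
        person.addDisjointWith(car);
        // owl:cardinality -- "a string quartet has exactly four members"
        OntClass quartet = m.createClass(ns + "StringQuartet");
        ObjectProperty hasMember = m.createObjectProperty(ns + "hasMember");
        quartet.addSuperClass(m.createCardinalityRestriction(null, hasMember, 4));
        m.write(System.out, "RDF/XML-ABBREV");  // serialize the small ontology
    }
}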

1.3 The three sublanguages of OWL

OWL provides three increasingly expressive sublanguages designed for use by specific communities of implementers and users.

  • OWL Lite supports those users primarily needing a classification hierarchy and simple constraints. For example, while it supports cardinality constraints, it only permits cardinality values of 0 or 1. It should be simpler to provide tool support for OWL Lite than its more expressive relatives, and OWL Lite provides a quick migration path for thesauri and other taxonomies. OWL Lite also has a lower formal complexity than OWL DL; see the section on OWL Lite in the OWL Reference for further details.
  • OWL DL supports those users who want the maximum expressiveness while retaining computational completeness (all conclusions are guaranteed to be computable) and decidability (all computations will finish in finite time). OWL DL includes all OWL language constructs, but they can be used only under certain restrictions (for example, while a class may be a subclass of many classes, a class cannot be an instance of another class). OWL DL is so named due to its correspondence with description logics, a field of research that has studied the logics that form the formal foundation of OWL.
  • OWL Full is meant for users who want maximum expressiveness and the syntactic freedom of RDF with no computational guarantees. For example, in OWL Full a class can be treated simultaneously as a collection of individuals and as an individual in its own right. OWL Full allows an ontology to augment the meaning of the pre-defined (RDF or OWL) vocabulary. It is unlikely that any reasoning software will be able to support complete reasoning for every feature of OWL Full.

Each of these sublanguages is an extension of its simpler predecessor, both in what can be legally expressed and in what can be validly concluded. The following set of relations hold. Their inverses do not.

  • Every legal OWL Lite ontology is a legal OWL DL ontology.
  • Every legal OWL DL ontology is a legal OWL Full ontology.
  • Every valid OWL Lite conclusion is a valid OWL DL conclusion.
  • Every valid OWL DL conclusion is a valid OWL Full conclusion.

Ontology developers adopting OWL should consider which sublanguage best suits their needs. The choice between OWL Lite and OWL DL depends on the extent to which users require the more-expressive constructs provided by OWL DL. The choice between OWL DL and OWL Full mainly depends on the extent to which users require the meta-modeling facilities of RDF Schema (e.g. defining classes of classes, or attaching properties to classes). When using OWL Full as compared to OWL DL, reasoning support is less predictable since complete OWL Full implementations do not currently exist.

OWL Full can be viewed as an extension of RDF, while OWL Lite and OWL DL can be viewed as extensions of a restricted view of RDF. Every OWL (Lite, DL, Full) document is an RDF document, and every RDF document is an OWL Full document, but only some RDF documents will be a legal OWL Lite or OWL DL document. Because of this, some care has to be taken when a user wants to migrate an RDF document to OWL. When the expressiveness of OWL DL or OWL Lite is deemed appropriate, some precautions have to be taken to ensure that the original RDF document complies with the additional constraints imposed by OWL DL and OWL Lite. Among others, every URI that is used as a class name must be explicitly asserted to be of type owl:Class (and similarly for properties), every individual must be asserted to belong to at least one class (even if only owl:Thing), and the URIs used for classes, properties and individuals must be mutually disjoint. The details of these and other constraints on OWL DL and OWL Lite are explained in appendix E of the OWL Reference.

 

2 Protégé (knowledge-based applications with ontologies)

Protégé is a free, open-source platform that provides a growing user community with a suite of tools to construct domain models and knowledge-based applications with ontologies.

References: http://www.w3.org/TR/webont-req/#onto-def, http://www.w3.org/TR/owl-features/, http://protege.stanford.edu/

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 13,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 5 sold-out performances for that many people to see it.
