
Ephesoft: Four Features of Practical Innovation

By Jake Karnes, ECM Consultant at Zia Consulting

With an eye to the future, Ephesoft continues to deliver practical innovation that improves the capabilities and usability of its core platform, Ephesoft Transact. Ephesoft demonstrates this commitment to current and future customers with four new features: cross-section extraction, automatic data conversion, paragraph extraction, and automatic regular expression suggestion and creation.

Ephesoft INNOVATE 2016 brought together leading minds to discuss the latest industry advances. Software companies face the persistent challenge of delivering practical innovation while staying true to their product’s role in a customer’s organization. Ephesoft tackles this problem with a two-pronged approach. Ephesoft remains on the cutting edge of document capture technology with their new big data analytics platform—Ephesoft Insight. Insight promises to extract content and meaning from documents scattered across an organization using machine learning and patented text-based analysis. In addition to pushing the envelope with Insight, Ephesoft is continuing to expand and strengthen Ephesoft Transact.

Feature 1: Cross-Section Extraction

Ephesoft Transact, formerly Ephesoft Enterprise, adds several powerful features in the upcoming 4.1 release. These features have roots in customer feedback and provide out-of-the-box functionality which previously required customization. One such feature is cross-section extraction. This technique uses the intersection of two keys to find the correct value. In the example below, the two keys are “Services Borrower Did Not Shop For” and “Borrower-Paid,” which meet at the value “$236.55.” This triangulation using multiple keys allows for the extraction of values which are ill-suited for existing extraction methods such as table extraction.
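Conceptually, the technique can be sketched in a few lines of code. The following is a simplified illustration of the idea, not Ephesoft's implementation: each OCR token carries the coordinates of its bounding box, and the extracted value is the token that shares a row with one key and a column with the other (the Token class, the coordinates, and the tolerance are all hypothetical):

```java
import java.util.Arrays;
import java.util.List;

public class CrossSection {
    // A word extracted by OCR, with the center point of its bounding box.
    static class Token {
        final String text;
        final double x, y;
        Token(String text, double x, double y) { this.text = text; this.x = x; this.y = y; }
    }

    // Return the text of the token that lies in the same row (y) as rowKey
    // and the same column (x) as colKey, within a positional tolerance.
    static String intersect(List<Token> tokens, Token rowKey, Token colKey, double tol) {
        for (Token t : tokens) {
            if (t != rowKey && t != colKey
                    && Math.abs(t.y - rowKey.y) < tol
                    && Math.abs(t.x - colKey.x) < tol) {
                return t.text;
            }
        }
        return null;
    }

    static String demo() {
        Token rowKey = new Token("Services Borrower Did Not Shop For", 50, 200);
        Token colKey = new Token("Borrower-Paid", 300, 100);
        List<Token> page = Arrays.asList(rowKey, colKey, new Token("$236.55", 300, 200));
        return intersect(page, rowKey, colKey, 10.0);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // $236.55
    }
}
```

A production implementation would of course also have to handle page skew, wrapped cells, and tokens that span columns.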


Feature 2: Automatic Data Conversion

Another feature which comes from business use cases is automatic data conversion. This feature allows extracted dates and other values to be automatically normalized to a standard format. For example, a date extracted as “MAR 21 2016” can automatically be converted to “03/21/2016” and vice versa. Other possible data conversions include predefined suffixes and prefixes, data replacement, upper- or lowercase conversion, and more. One novel use for this functionality would be to clean up imperfect OCR results. The extraction rules could be defined to allow for missing or erroneous characters, and the values could then be corrected during this data conversion step by removing or substituting the known, correct character(s).
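As a rough sketch of what such a normalization step does (this is generic Java, not Ephesoft's conversion API, and the format patterns are assumptions):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateNormalizer {
    // Normalize a date such as "MAR 21 2016" to MM/dd/yyyy.
    static String normalize(String raw) {
        try {
            SimpleDateFormat in = new SimpleDateFormat("MMM dd yyyy", Locale.US);
            SimpleDateFormat out = new SimpleDateFormat("MM/dd/yyyy", Locale.US);
            return out.format(in.parse(raw));
        } catch (ParseException e) {
            return raw; // leave unconvertible values untouched
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("MAR 21 2016")); // 03/21/2016
    }
}
```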


Feature 3: Automatic Regular Expression Suggestion and Creation

Another example of Ephesoft’s dedication to improving user experience by expanding Transact’s functionality is the new automatic regular expression suggestion and creation. Ephesoft has recognized the pain of writing regular expressions by hand and helps minimize these efforts by suggesting regular expressions automatically. These suggestions are sourced from Ephesoft’s own library of common regular expressions for values such as emails, dollar amounts, and dates. But Ephesoft can even help you create custom regular expressions based on the examples provided during extraction training. This strikes a powerful balance between the flexibility of writing your own and the ease of having them automatically suggested or created for you. The usefulness of regular expressions is unlocked without burdening the user with learning complex regular expression notation. As an added bonus, this feature is already included in the latest release of Transact, and further information can be found on Ephesoft’s wiki.
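To make the idea concrete, here is roughly what entries in such a library of common patterns look like. These are generic illustrations of dollar-amount and email patterns, not Ephesoft's actual expressions:

```java
import java.util.regex.Pattern;

public class RegexSuggestions {
    // Illustrative patterns of the kind a suggestion library might contain.
    static final Pattern DOLLAR_AMOUNT = Pattern.compile("\\$\\d{1,3}(,\\d{3})*\\.\\d{2}");
    static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.]+");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(DOLLAR_AMOUNT, "$236.55")); // true
        System.out.println(matches(EMAIL, ""));
    }
}
```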

Feature 4: Paragraph Extraction

Paragraph extraction demonstrates Ephesoft Transact’s ability to mine valuable information from unstructured documents. This feature enables the user to define values to be extracted from within larger bodies of text, without specific keywords or fixed locations. As an example, consider the following sections of a mortgage note:

Paragraph extraction can be used to extract each of the highlighted values. Even values which wrap around multiple lines (e.g. “Super Mortgage Inc”) can be handled with ease. Previously, this would have required custom scripting or a complex combination of different extraction techniques. Paragraph extraction allows the user to unlock information from their documents which may have been unused before.

These features indicate that Ephesoft’s innovation is not limited to their groundbreaking analytics platform. They continue to implement practical innovation which is equally important for new and existing customers. These features provide straightforward solutions to common pain points. By inviting and accepting feedback from their customers and partners, Ephesoft is pushing the capture industry forward on multiple fronts.


Jake Karnes is an ECM Consultant at Zia Consulting. He extends and integrates Ephesoft and Alfresco to create complete content solutions. In addition to client integrations, Jake has helped create Zia stand-alone solutions such as mobile applications, mortgage automation, and analytic tools. He’s always eager to discuss software down to the finest details. You can find Jake on LinkedIn.

Tech Post: Extracting Metadata in Alfresco


by Jeff Rosler, Solutions Architect at Zia

When importing files, each is uploaded with additional information including things like title, description, and text. Out of the box, Alfresco extracts the properties that have been mapped, and metadata is taken from the content using Apache Tika.

The following are some simple examples of how metadata can be pulled from different mime types and set on Alfresco properties. Since Apache Tika is used as the basic metadata extractor in Alfresco, you can use it to extract metadata for all the mime types that it supports. The version of Tika embedded in Alfresco 5.0 and 5.1 is essentially Tika 1.6. The TikaAutoMetadataExtracter class loads all the mime types that the embedded version of Tika supports. So, all you need to do is create a Spring bean that references that class, set the properties to extract, and set the Alfresco properties you’d like to have set. You don’t have to write any custom code.

Example 0 – Set logging to see what metadata can be extracted

Before defining your metadata extraction, it’s a good idea to set the logging level for metadata extraction to DEBUG. When you do this, the metadata extracted from a file is shown in the log, which lets you choose the correct embedded metadata property names to configure. You can set this by editing the file for the repository and adding the following line.
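Assuming the stock log4j setup that ships with Alfresco, the line to add is the DEBUG setting for the mapping extracter class (the class name can be seen in the log output below):

```properties
# Show extracted metadata in the repository log
```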

Restart Alfresco and import a file. You should see something like this in the log. You can see properties with namespaces, such as dc:title (the dc stands for Dublin Core, a metadata standard), as well as other properties that don’t contain a namespace. You can use these embedded properties to map to standard or custom Alfresco properties.

2016-02-03 10:03:49,474 DEBUG [content.metadata.AbstractMappingMetadataExtracter]
 [http-bio-8080-exec-10] Extracted Metadata from ContentAccessor[ 
 size=286436, encoding=UTF-8, locale=en_US]
 Found: {date=2016-01-22T18:59:00Z, Total-Time=1, extended-properties:AppVersion=14.0000,
 meta:paragraph-count=12, subject=beer, ipsum, meta:print-date=2016-01-22T18:59:00Z,
 Word-Count=405, meta:line-count=45, Manager=null, Template=Normal.dotm, Paragraph-Count=12,
 meta:character-count-with-spaces=2246, dc:title=Tom's Ipsum Beer, modified=2016-01-22T18:59:00Z,
 meta:author=Jeff Rosler, meta:creation-date=2015-12-31T15:49:00Z,
 Last-Printed=2016-01-22T18:59:00Z, extended-properties:Application=Microsoft Macintosh Word,
 author=Jeff Rosler, created=2015-12-31T15:49:00Z, Creation-Date=2015-12-31T15:49:00Z,
 Character-Count-With-Spaces=2246, Last-Author=Jeff Rosler, Character Count=1853, Page-Count=2,
 Application-Version=14.0000, extended-properties:Template=Normal.dotm, Author=Jeff Rosler,
 publisher=Zia Consulting, meta:page-count=2, cp:revision=4,
 Keywords=beer, ipsum, meta:word-count=405,
 dc:creator=Jeff Rosler, extended-properties:Company=Zia Consulting,
 description=beer, ipsum, dcterms:created=2015-12-31T15:49:00Z,
 Last-Modified=2016-01-22T18:59:00Z, dcterms:modified=2016-01-22T18:59:00Z,
 title=Tom's Ipsum Beer, Last-Save-Date=2016-01-22T18:59:00Z, meta:character-count=1853,
 Line-Count=45, meta:save-date=2016-01-22T18:59:00Z, Application-Name=Microsoft Macintosh Word,
 extended-properties:TotalTime=1, extended-properties:Manager=null,
 creator=Jeff Rosler, comments=null, dc:subject=beer, ipsum, meta:last-author=Jeff Rosler,
 xmpTPg:NPages=2, Revision-Number=4, meta:keyword=beer, ipsum, dc:publisher=Zia Consulting}

Example 1 – Set author, title, description

Specify your Spring bean. You can name the id anything you want (that is a legitimate XML id) and point to the TikaAutoMetadataExtracter class (yes, that isn’t how you spell Extractor; the code misspells it with an “e” instead of an “o”). In the code block below, we are overriding the default mapping and pointing to a separate properties file. The properties could have been listed inline here, but pointing to a properties file allows for easier editing.


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns=""
       xmlns:xsi=""
       xsi:schemaLocation="">
   <!-- The bean id and the properties-file location below are examples; use your own. -->
   <bean id="myTikaMetadataExtracter" class="org.alfresco.repo.content.metadata.TikaAutoMetadataExtracter" parent="baseMetadataExtracter">
      <constructor-arg><ref bean="tikaConfig"/></constructor-arg>
      <property name="inheritDefaultMapping" value="false"/>
      <property name="mappingProperties">
         <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
            <property name="location" value="classpath:alfresco/extension/my-metadata-mapping.properties"/>
         </bean>
      </property>
   </bean>
</beans>


After specifying your Spring bean that points to a properties file, set up the properties file itself: declare any Alfresco namespaces you’re using from the content model, then list each property to be mapped. Note that if you map properties that belong to aspects, those aspects will be applied to the content node automatically during extraction. Put the embedded metadata property name on the left of the equals sign and the Alfresco property on the right. If you are specifying an embedded property that has a namespace prefix (e.g. dc:title), remember to escape the colon with a backslash (e.g. dc\:title). You don’t need to do that for the Alfresco property on the right, just the embedded property name on the left.


# Namespaces
# Mappings
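A minimal mapping file for this example might look like the following. This is an illustrative sketch: the cm namespace URI is Alfresco's standard content model namespace, and the embedded property names come from the DEBUG log output in Example 0:

```properties
# Namespaces

# Mappings: embedded property on the left, Alfresco property on the right.
# Namespaced embedded properties like dc:title need the colon escaped.
author=cm:author
dc\:title=cm:title
description=cm:description
```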

Example 2 – Setting multiple Alfresco properties 

Embedded metadata can be mapped to multiple Alfresco properties by specifying those properties as comma-separated values. The example below shows setting the embedded author value to both cm:author and cm:description.


# Namespaces
# Mappings
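Filled in, the mapping described above might read as follows (illustrative; the namespace declaration is the same as in Example 1):

```properties
# Namespaces

# Mappings: one embedded property mapped to two Alfresco properties
author=cm:author, cm:description
```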

Example 3 – Specifying when properties are extracted

The metadata extractor has something called an OverwritePolicy, which specifies when an Alfresco property is overwritten. For example, you might not want your extractor to run every time a new version of a file is stored, as this would overwrite any mapped property values that were updated manually via Share or automatically through actions, workflows, or other processes. Therefore, Alfresco defaults the OverwritePolicy to PRAGMATIC. This essentially means it extracts only if the extracted property is not null and the Alfresco property is not set or is empty.

However, if you want to change the behavior so that the extraction happens all the time (e.g. when content is updated), then you should set the OverwritePolicy to EAGER. This can be done by passing that as a parameter within your extractor bean as can be seen below.

<!-- overwritePolicy set to EAGER so extraction runs on every update;
     the bean id and properties-file location are examples. -->
<bean id="myTikaMetadataExtracter" class="org.alfresco.repo.content.metadata.TikaAutoMetadataExtracter" parent="baseMetadataExtracter">
   <constructor-arg><ref bean="tikaConfig"/></constructor-arg>
   <property name="inheritDefaultMapping" value="false"/>
   <property name="overwritePolicy" value="EAGER"/>
   <property name="mappingProperties">
      <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
         <property name="location" value="classpath:alfresco/extension/my-metadata-mapping.properties"/>
      </bean>
   </property>
</bean>

Example 4 – Setting tags

Support for mapping tags was added in Alfresco 4.2.c. Details are mentioned in this blog post. You can easily add that to your extraction mapping. It just needs to be enabled in the extract-metadata bean and then the mapping set within your properties file.

NOTE: When setting tags, don’t do this while running from the Alfresco SDK using springloaded. Tagging won’t work, and as soon as you try to import some content with tags (after you’ve made the updates below), your content will fail to load.

ALSO NOTE: I noticed in Alfresco 5.0 that the embedded keywords are getting concatenated into a single comma-separated tag. This has been identified as a bug, and a JIRA issue (MNT-15497) was created to fix it. The fix went into 5.0.4 and 5.1.1.

The following code block can be added to your spring bean xml config file to enable tagging.


<!-- Override the metadata extraction bean from action-services-context.xml to turn on
     the taggingService and enableStringTagging. This will allow keywords to get mapped to tags. -->
<bean id="extract-metadata" class="org.alfresco.repo.action.executer.ContentMetadataExtracter" parent="action-executer">
  <property name="nodeService">
    <ref bean="NodeService" />
  </property>
  <property name="contentService">
    <ref bean="ContentService" />
  </property>
  <property name="dictionaryService">
    <ref bean="dictionaryService" />
  </property>
  <property name="taggingService">
    <ref bean="TaggingService" />
  </property>
  <property name="metadataExtracterRegistry">
    <ref bean="metadataExtracterRegistry" />
  </property>
  <!-- applicableTypes as in the stock bean definition -->
  <property name="applicableTypes">
    <list>
      <value>{}content</value>
    </list>
  </property>
  <property name="carryAspectProperties" value="true" />
  <property name="enableStringTagging" value="true" />
</bean>

After tagging is enabled, just update your property file to map the appropriate embedded Keywords property to cm:taggable. The example below uses the embedded Keywords property.

# Namespaces
# Mappings
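The corresponding mapping file would contain a single line of this shape (illustrative; the namespace declaration is the same as in Example 1, and cm:taggable is Alfresco's standard tagging property):

```properties
# Namespaces

# Mappings: embedded Keywords become tags on the node
Keywords=cm:taggable
```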


Jeff Rosler has more than 15 years’ experience architecting and developing enterprise content management solutions for customers across multiple verticals, helping solve different business challenges. These solutions include digital asset management, component content management using XML, business process management, and web content management utilizing Alfresco and related standards, technologies, and products.

Webinar Series: Insurance in 2016 – Process Efficiency and Data Security

As we move into 2016, many insurance organizations are looking to transition from their end-of-year strategic planning towards the achievement of their set goals. Whether property and casualty, life insurance, reinsurance, or any other type, top priorities are improving process efficiency to deliver measurable business results and enhancing data security to mitigate enterprise risk. But, how can you get there? Our three-part webinar series shows how leveraging automation technologies can deliver a rapid ROI in document processing and also provide your organization with our Universal Content Security. Each webinar is 30 minutes in length and includes a demonstration and specific customer examples.

Webinar 1: Automating Claims Processing 
See how a leading insurance provider revolutionized their business using Intelligent Document Capture and a modern, integrated ECM content hub.

Watch Now!

Webinar 2: Automating Contracts and AP 
Learn to automate common back office business processes —from contracts management to AP—with tools like Document Assembly and enterprise integration into ERP systems such as SAP or Microsoft Dynamics.

Watch Now!

Webinar 3: Addressing Cyber Security
Discover how the application of data security can be automated across the entire organization and extended to external collaboration tools from email to Dropbox.

Watch Now!

Zia Lightning Talks: Alfresco Email Templates

Zia conducts monthly, internal lightning talk sessions. These short, five-minute presentations cover topics important to Zia, our partners, and the industry. We’ve decided to start sharing some of these useful presentations with you. This post covers the talk presented by Lucas Patingre, ECM Architect at Zia Consulting.

Despite some companies trying to cut down on internal email, it is still frequently used as a notification tool. Right out of the box, Alfresco gives us the ability to send inline emails—written directly in the code—in both HTML and text formats, or to send emails based on templates.

Benefits of Alfresco Templates

  • Separation of the view
    • Separating the presentation layer from the code makes it easier to work on the email structure or the dynamic content independently.
    • In the end, this will allow us to write and maintain more complex emails.
  • Localization
    • Alfresco supports multiple languages which manifests at several levels. The most obvious is the UI where all text is encapsulated within localized properties files to render based on user preferences.
    • For email templates, we can create one template file per language we want to support and then choose the right one at the time we send the email.
  • Edit online
    • The email templates are stored in the repository making it simple for an administrator to edit them without creating a new build of a customization.

How-To Use Alfresco Templates

  • Calling the action
    • Instead of passing raw text to the email action, you will pass the reference to a template node and a parameter map with the data to inject.


// Inline email: the subject and body are passed as raw text.
Action mail = actionService.createAction(MailActionExecuter.NAME);
mail.setParameterValue(MailActionExecuter.PARAM_SUBJECT, "Inline email subject");
mail.setParameterValue(MailActionExecuter.PARAM_TEXT, "Inline email body");

// Templated email: a template node plus a model map with the data to inject.
Map<String, Object> model = new HashMap<String, Object>();
Action mail = actionService.createAction(MailActionExecuter.NAME);
mail.setParameterValue(MailActionExecuter.PARAM_SUBJECT, "Templated email subject");
mail.setParameterValue(MailActionExecuter.PARAM_TEMPLATE, getEmailTemplate());
mail.setParameterValue(MailActionExecuter.PARAM_TEMPLATE_MODEL, (Serializable) model);
  • Bootstrapping the email templates
    • While not mandatory, it’s better to bootstrap the templates to the repository instead of uploading them manually. You should end up with one template per language you want to support.


Note: The configuration used to bootstrap the templates is out of the scope of this blog post. You can find reliable resources online on how to bootstrap using the ImporterModuleComponent.

  • Localized email template fetching: getLocalizedSibling
      • The FileFolderService has an interesting method called getLocalizedSibling that can retrieve a localized (using the server’s locale) version of the template.
private NodeRef getEmailTemplate() {
    try {
        List<NodeRef> nodeRefs = searchService.selectNodes(
                nodeService.getRootNode(StoreRef.STORE_REF_WORKSPACE_SPACESSTORE), EMAIL_TEMPLATE_XPATH, null,
                nameSpaceService, false);
        if (nodeRefs.size() != 1) {
            logger.error("Cannot find the saved search notification email template: " + EMAIL_TEMPLATE_XPATH);
            return null;
        }
        return fileFolderService.getLocalizedSibling(nodeRefs.get(0));
    } catch (SearcherException e) {
        logger.error("Cannot find the saved search notification email template: " + EMAIL_TEMPLATE_XPATH, e);
    }
    return null;
}

Capabilities of FTL Templates

  • Basic variable injection
    • This is extracted from the default “Following” email template in Alfresco
    <#assign followerFullName>${followerFirstName} ${followerLastName}</#assign>
                    <img src="${shareUrl}/res/components/images/help-people-bw-64.png" />
                    <div>${(followerFullName?trim)?html} is now following you.</div>
                    <div><#if followerJobTitle??>${followerJobTitle?html}<br/></#if></div>
  • Freemarker logic
    • This is extracted from the default “Activities” email template in Alfresco. As activities can be one of several kinds, and we want to format the notification differently for each kind, it requires more FTL logic. I have removed much of the actual display work to mainly keep the logic structures.
    <#if activities?exists && activities?size > 0>
        <#list activities as activity>
            <#if activity.siteNetwork??>
                <#assign firstVar="Something">
                <#assign otherVar=false>
                <#switch activity.activityType>
                    <#case "">
                    <#case "">
                        <#assign firstVar="Something else">
                        <#break>
                    <#case "">
                        <#assign otherVar=true>
                        <#break>
                </#switch>
                <div class="activity">
                    <#if otherVar>${firstVar}<#else>Not var</#if>
                </div>
            </#if>
        </#list>
    </#if>
  • Handle nodes
    • Also useful is the ability to fetch nodes and access their different properties. To use this, you will need to inject “companyhome” into your email template.
        <th>Modified date</th>
    <#list savedSearch.getNewResults() as savedSearchResult>
        <#assign savedSearchResultNode=companyhome.nodeByReference[savedSearchResult.toString()]>
            <td><a href="${shareUrl}/page/user/${}/profile">${}</a></td>
            <td><a href="${viewUrl}${['sys:node-uuid']}">${}</a></td>

From this, you can see the benefits of utilizing these templates to create a more streamlined deployment of your email communications. If you have any questions on how to best implement this approach, please contact us today.

Our Thoughts on the All-New Ephesoft Universe

Today the industry received word of an innovative new technology—Ephesoft Universe. Universe is a tool that matches enterprise Big Data with intelligent capture, allowing you to process and analyze all of the content contained in your organization’s documents, even if unstructured. I have had the pleasure of early access to the product for evaluation and want to share my thoughts with you.

Universe allows business users to define different documents and data they would like to analyze in an easy and intuitive fashion. By automatically identifying large amounts of data fields like addresses, amounts, etc., users can tag the values with appropriate names used in the business (e.g., primary address, loan amount, SSN, employer). Universe then analyzes the values and determines the best ways to extract the data as more documents are processed. Once the definitions are set up, very large groups of documents can be processed in a very short time by utilizing Apache Spark clusters. This can be deployed on premise or in the cloud.  

It’s all well and good to rattle off some technical specs and usage, but why should you, and I, care? While Ephesoft Enterprise intelligent capture allows you to automate the intake of documents to drive business processes and archival, it doesn’t harness the immense amount of data you already have in your enterprise. Imagine, as a mortgage insurance company, being able to set rates based on data you already have from the many closing packages you have processed. With Universe, this data is found by looking at defaults in various zip codes, sizes of homes, loan amounts, home values, and more. You can even take the data you have and project it out for months. Consider a fraud detection company having the ability to determine how many loan applications a particular name or social security number has applied for in a specified amount of time. Many companies pay other entities for this information—data that already exists in documents they process in their enterprise.

We are excited about this innovative technology and the advanced solutions it will enable Zia to provide to our clients. Ephesoft Universe is going to save our customers time and money while lowering risk. Please contact us to discuss how this technology can help you.

– Pat Myers, EVP and Co-Founder of Zia Consulting

Demo – Adhere for Alfresco: Legal – Part 2

In this video, Sr. Solutions Engineer Jon Solove demonstrates the execution of a contract using Adhere for Alfresco: Legal, a content management solution from Zia Consulting.

For corporate legal departments within many organizations, the choice of a document management system has been limited to a small number of legacy vendors with complex and costly offerings that users are forced to accept, rather than working the way they want to work. The result is “ECM avoidance” with users finding ways around their ECM system–utilizing email, shared drives, or cloud technologies.

Today there is an alternative: Adhere for Alfresco: Legal, powered by Alfresco and delivered by Zia Consulting. Zia understands that when systems are easy-to-use and leverage existing tools like Office or Google Docs, the result is increased utilization and an improvement in control and compliance. Our Adhere for Alfresco: Legal solution delivers document and records management that’s as simple as using email or file systems, with the power of enterprise-class CMS features and functionality.


Video Demo – Adhere for Alfresco: Legal – Part 1

In this demo, Sr. Solutions Engineer Jon Solove demonstrates the creation of a case using Adhere for Alfresco: Legal, a content management solution from Zia Consulting.

For many corporate legal departments and AM100 law firms, the choice of a document management system has often been limited to a small number of legacy vendors with complex and costly offerings that users are forced to accept, rather than working the way they want to work. The result is “ECM avoidance” with users finding ways around their ECM system–utilizing email, shared drives, or cloud technologies.

Today there is an alternative: Adhere for Alfresco: Legal, powered by Alfresco and delivered by Zia Consulting. Zia understands that when systems are easy-to-use and leverage existing tools like Office or Google Docs, the result is increased utilization and an improvement in control and compliance. Our Adhere for Alfresco: Legal solution delivers document and records management that’s as simple as using email or file systems, with the power of enterprise-class CMS features and functionality.

From the Desk of Yoran: The ROI from Data-Centric Security

This is the second blog post in the series “From the Desk of Yoran”.

Yoran Sirkis, CEO of Covertix, is a seasoned executive with more than 20 years of experience in information security, specializing in data and physical risk management. He is also a frequent speaker at leading industry conferences.


The ROI from Data-Centric Security

When was the last time you counted the number of security tools in your organization? How many different vendors are involved? What are the maintenance and licensing costs?

I bet you lost count… anyone would.

Companies strive to define and apply security rules that will best protect data, based on their specific business needs. During that process, IT staff encounter evolving security needs and are exposed to an endless amount of solutions—each addressing a valid, real-world security challenge.

As part of the process of protecting their enterprise data, organizations end up dealing with a variety of vendors, pricey integrations, busy helpdesks, frustrated users, and continuously increasing expenses.

Doing More with Less

As I mentioned in my previous post, The Need for Data-Centric Security, the age of data-centric security transforms the security focus from top-down to bottom-up. It’s this reasoning that makes a data-centric solution more valuable than what it was originally implemented for.

Guided by the understanding and importance of offering a security solution that presents a clear ROI, we have developed a single data-centric solution that delivers much more than file protection.

Data Classification

How many unstructured data files do you think sit on your corporate network? Take a guess. It’s a scary thought, isn’t it? Before you make your team (and yourself) insane by protecting every single file individually, we recommend you take the data classification approach—that’s right, data-centric.

You need a system that easily lets you implement policies across document types, from CAD design files in R&D to files in the accounting department that contain the number 4128 at the start of a 16-digit number.
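That last rule is essentially a pattern match. As a hedged illustration (generic Java, not any vendor's classification API), such a policy could be expressed as a regular expression that flags files containing a 16-digit number starting with 4128:

```java
import java.util.regex.Pattern;

public class CardPrefixRule {
    // "4128 at the start of a 16-digit number": 4128 followed by 12 more digits,
    // bounded so it does not match inside a longer digit run.
    static final Pattern CARD = Pattern.compile("\\b4128\\d{12}\\b");

    static boolean containsSensitiveNumber(String text) {
        return CARD.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsSensitiveNumber("Card: 4128123412341234")); // true
        System.out.println(containsSensitiveNumber("Invoice 12345"));          // false
    }
}
```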

File Protection 

Having a file protection solution in your organization is critical, but most solutions provide limited protection. A data-centric security solution brings much more to the table. It enables organizations to protect, manage, and audit files internally AND externally, to share sensitive information with external users, and to protect information on different devices.

File Encryption

Encrypting files is necessary to ensure your data is protected and is used only by the people it was intended for. Most solutions burden users, forcing them to learn a new system and placing all the responsibility on them.

A data-centric solution removes that burden by offering a system that operates seamlessly and without affecting users’ behavior. This secures your data in all of the following cases:

  • After the file has been opened (using any device or location)
  • When content is copied/pasted to a new document
  • Protection of the file’s metadata
  • When sensitive files are shared with external users

Your files are protected, and you can audit and monitor the usage of their content no matter where the files actually reside—inside or outside the organization.

Secure Vaults

Confidential data is often placed within secure vaults. But even the best vault will only keep your data secure while the data is stored inside it. A data-centric solution provides persistent security, keeping your data secure anytime and anywhere: from the moment a confidential document is created, through any transport, and even when it is downloaded on any device.

As it is transparently integrated into existing business driven processes with automated rules or manual override, a data-centric security solution will not impose on IT staff and is not dependent on a user’s actions.

Cloud Security

It’s a given that assets residing in the cloud need to be protected. Because of this, cloud providers began offering their own security solutions as well as those from third parties. Of course, the costs begin to pile up and companies are often still uncertain about who else might have access to their cloud-based data.

Deploying a smart, data-centric security solution is the best way to protect your data anywhere—and even from cloud providers themselves.

Data Leak Prevention (DLP)

DLP solutions aim to prevent files from leaving your business unintentionally or through malicious actions. But how do you protect your data if it is leaked? A data-centric security solution continues to monitor your files even outside the organization and ensures the data they contain remains secure.

And most importantly….

Plenty of solutions secure your data at rest, in motion, or when it goes to third parties, but only a data-centric solution can secure the file structure and the data it contains, so you know your data is always protected.

If you have concerns about your confidential data when it’s in motion, at rest, or in use, and whether it could be lost through a data breach, a stolen device, or some other unintentional or malicious way, we can give you peace of mind. You CAN have a system with a strong ROI, because you won’t find yourself facing lawsuits or losing customers and your reputation.

For more information, please visit

Alfresco Development—Past, Present, and Future


by Bindu Wavell, Chief Architect at Zia Consulting

There are two reasons I decided to write this post. First, I want to acknowledge Alfresco for their recent investments in the developer ecosystem. Second, I want to explain where I think we are heading with our development efforts. My ulterior motive is to find people to collaborate with us on these efforts.

Since Thomas DeMeo joined Alfresco as VP of Product Management a bit over a year ago, we’ve noticed a dramatic increase in the focus on system integrators as key stakeholders for Alfresco—and not just based on expertise in sales and business development. After the release of Alfresco One 5.0 at the Alfresco Summit, we saw the likes of Peter Monks and Gabriele Columbro tapped to bring focus to user stories that are important for administrators and developers within the product management organization. Recently, Richard Esplin transitioned from community lead to focusing on the Community Edition within product management. Alfresco hired Martin Bergljung and Ole Hejlskov to focus on developer tooling/evangelism and community outreach. Within weeks of starting, these individuals put together a new release of the SDK, incorporating contributions, adding new capabilities, and completely revamping the documentation. I’m thrilled that Alfresco is focusing resources in these areas because I think we will see resolution of a lot of technical debt—and that allows for better solutions in less time, leading to a bigger and more vibrant community.

In the past year or so, Alfresco engineering has begun to reorganize into smaller, more agile scrum teams. This reorganization—along with the focus on product management—will drive initiatives like release agility to provide more frequent and better tested releases of distinct products. It should also provide a platform for resolving technical debt in a more sustained and predictable fashion than we’ve seen in the past. We can also expect cool new products that are easier to integrate and customize: things like Activiti Enterprise, the integration between Activiti Enterprise and Share, enhanced Office services, reporting/analytics, media management, and even new Case Management features. Not to mention, significant improvements in the repository, Share, and records management.

As the Chief Architect at Zia, part of my mission is to facilitate improvements in developer productivity and satisfaction. In addition, I want to help the team find ways to improve project quality and consistency. I’d like to share where we are heading in these areas, but first let’s cover where we’ve been.

In the past year or so, most of our projects have been based on the third major revision of our development framework. We call the framework—the project structure and the associated tooling—Zia Alfresco Quickstart (for more information, watch this video). Quickstart includes a standard project structure that we evolved from the all-in-one archetype provided by the Alfresco 1.0 Maven SDK. It features reusable code, examples, best practices, and, to some extent, standardizes how we version and deliver our projects and reusable sub-projects.

With version 1.0 of the SDK, as well as our earlier project structures, we were seeing cycle times (from the point when we saved our code to when we were able to exercise the code) of between two and five minutes on very powerful laptops with lots of RAM and solid-state disks. One of the main reasons we started evolving the SDK was to reduce this cycle time. When we started using Quickstart for customer projects, we were able to reduce the cycle time for most edits to about 10 seconds. We did this by taking advantage of incremental compilation and hot deployment techniques. If I were writing this post a couple of years ago, it would have been all about flow. It was hard to experience flow when you had time for tea and a bagel after most code/config changes. Fortunately, this is not as much of an issue anymore. The Alfresco 1.1.x SDK made some similar techniques available for the wider developer community. With the 2.0 SDK, this has been improved even more—but there’s still work to be done.
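As a rough illustration, the fast-cycle workflow with the 2.x SDK looks something like the following sketch; the `alfresco:run` goal comes from the alfresco-maven-plugin, while the hot-reload tooling mentioned in the comments is our assumption about a typical setup and may vary by project:

```shell
# Build the all-in-one project and start an embedded Alfresco instance.
# The alfresco:run goal is provided by the alfresco-maven-plugin (SDK 2.x).
mvn clean install alfresco:run

# In a second terminal, edit code or config. With incremental compilation
# and a hot-deployment agent configured (e.g. JRebel or similar), most
# class and web script changes are picked up by the running server without
# a restart, cutting the edit/verify cycle from minutes to seconds.
```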

One area where Quickstart enhances the SDK capabilities is an integration testing framework for repository customizations that also supports continuous integration and, to some extent, delivery. After we presented this framework during Tech Talk Live #69 (see video above), the 1.1 SDK added a similar capability—however, that solution has been a bit unreliable. We contributed the Quickstart testing framework to the SDK team and are hopeful it will be incorporated in the near future. We are excited that the 2.1.0 version includes support for the Share Page Object testing extensions to Selenium WebDriver that were, and continue to be, developed by the Share engineering team. This will make it much easier to create tests for UI customizations and to make sure our customizations don’t unexpectedly break existing capabilities provided with the products.

With the project structure we used before Quickstart, it often took us between four and eight hours to get a full development environment (just the Alfresco pieces) installed and configured. With Quickstart, we’ve reduced this to around two or three hours.

We often need to work on code for multiple projects in any given week. In order to handle this, and to accommodate customer variations, we usually set up our development environments in virtual machines (VMs). Nearly every time, we’ve had to start from a base OS image.

Typically, one team member sets up the initial VM, installs the development tools, and sets up the project structure. Then the VM is shared with all of the team members. We make heavy use of VM snapshots and usually someone keeps a pristine copy of the VM that tracks releases. Should a new developer join the project, or an upgrade be performed, we utilize this pristine copy. Often these VMs are over 40GB, requiring a substantial amount of time just to copy the data.

At Zia, we’ve been testing a few different code review approaches. Some projects are doing regular reviews (weekly for example), others are focusing on reviewing each new significant feature. The ability to create pull requests from forks and branches in BitBucket and GitHub has provided enough of a framework for us so far—though we’d love to incorporate more tooling around code quality and coverage to provide consistent feedback to users.

The Path Ahead

Quickstart has been a proprietary solution that allows us to complete projects faster and at lower cost than we were able to previously. One of the downsides to it being proprietary is that there is a smaller community for collaboration and support of the approach. The next version of our project structure is being developed in the open, using the open source model we admire so much.

The Quickstart project structure is quite different from any of the official SDKs, and there are good reasons for the differences. In many cases, they improve on what is available in the community today. However, what we have is different enough from the standard that new team members often face a steep learning curve to become proficient and ultimately master the structure. So, while a seasoned practitioner will be very productive, newer folks require more time and support to become productive. This turns out to be detrimental to the goal of improving productivity for some team members.

With the next generation project structure, we plan to stay closer to the official SDK so that there is a much larger community for collaboration. While we still plan to include support for certain opinionated features, we will also support and default to using more traditional Alfresco implementation approaches. Our hope is that this change in direction will facilitate quicker onboarding and allow SDK and Alfresco upgrades to be handled more expeditiously.

With Quickstart and all of the Alfresco SDKs to date, we have to duplicate a large portion of the boilerplate code for common Alfresco customizations such as web scripts, actions, behaviors, jobs, and workflows via cut and paste. While most of these aren’t difficult, the copying does tend to be error-prone.

Our new approach is to provide a Yeoman project generator. This automates the construction of a project using the all-in-one archetype from the SDK while adding a few bells and whistles for improved productivity. Though it’s still in its infancy, this part of the project is available now and we are using it for customer projects when appropriate. In addition, we are working on sub-generators for common and boilerplate things such as adding AMPs (source; third-party from Git; third-party from Maven; checked into the project), adding web scripts, and adding actions. We also plan to work on sub-generators for adding behaviors, models, workflows, jobs, JavaScript extensions, and data bootstrapping. We may even create generators for common tasks like switching from H2 to a real database, enabling LDAP, hooking up the standard image processing tools, and other common tasks that collaborators think will add significant value during the development phase of Alfresco implementation projects.
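In practice, the generator workflow sketched above looks roughly like this. The package name `generator-alfresco` and the sub-generator invocation are assumptions based on Yeoman's usual conventions and may differ from the published tooling:

```shell
# Install Yeoman and the Alfresco project generator (package name assumed)
npm install -g yo generator-alfresco

# Scaffold a new SDK-based all-in-one project; the generator prompts for
# project coordinates and options, then lays down the project structure
mkdir my-alfresco-project && cd my-alfresco-project
yo alfresco

# Later, invoke a sub-generator to add boilerplate, e.g. a new web script
# (sub-generator name is hypothetical)
yo alfresco:webscript
```

The point of the sub-generators is that boilerplate is produced by tooling rather than copied by hand, which removes the error-prone cut-and-paste step described above.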

The development VMs we’ve been using are difficult to version control, slow to copy, and frankly, take significant CPU, disk, and memory resources that we’d prefer to allocate to development and runtime tasks.

We’ve been toying with setting up our development environments using devops tools such as Ansible, Chef, VMWare, Vagrant, and Docker. Using Docker, we have been able to spin up and exercise clustered Alfresco environments on a single machine for testing and POC activities. We’ve also used Vagrant and Ansible to get about a 40% head start on our development VMs. The hope is to script 90% of the project setup efforts, to reduce project setup time, and increase consistency between our projects. We also hope to utilize Docker or other lightweight container solutions to reduce the overhead of our environments.

To date, we’ve had mixed success using these tools to set up our development environments. It often takes a significant amount of time to create and refine the devops scripts, and we don’t expect to see a return on our investment until we’ve utilized and stabilized these tools across a number of projects. Fortunately, we have worked with a few customers to create production-quality release/delivery substrates using these tools. Our hope is to incorporate our experiences from these projects into the developer tooling with an eye toward standardizing how we install and configure Alfresco solutions in all environments. We feel that by utilizing these techniques, developers will be able to rebuild small, containerized environments from scratch when needed, rather than maintaining and sharing monolithic VMs. This approach will be much easier to version control, easier to upgrade, easier to share, and will be lighter on resources.

An area we are also exploring is the use of cloud development infrastructure (e.g. Codenvy) to develop, run, and test our projects. We’d like to utilize our devops work and create containers that we can use during development and testing and potentially as a vehicle for delivering projects as well. It would be great if this allowed additional interactivity and collaboration during code reviews, while fixing bugs, and for training/coaching users one-on-one or in groups. We’d also like to reduce expenditures on hardware for developers and to deliver progressive capacity to our engineering organization. The ultimate goal is to work smarter with our in-house, remote, and offshore team members.

Our first usable effort in the area of cloud development is the contribution workflow for our new Yeoman generator. By clicking a button on the project GitHub page, we can provision a development environment that has access to the project source code and a Docker container set up with the appropriate versions of Java, Maven, Node, and Yeoman. The local (in-container) Maven repository comes pre-seeded with the assets needed for compiling and running the Alfresco projects we build while testing out the generators. Someone wishing to make a contribution can start developing and testing in under a minute and can send us a pull request directly from the generated project on Codenvy.

View a generator contribution demo video here

We’d like to invite you to collaborate on these ideas and deliverables. Currently, we are focused on completing our first pass on the Yeoman generator and some high value sub-generators. We’d love to collaborate in order to continue evolving the developer/implementer experience for Alfresco extensions. If you are interested, please leave a comment, send an email, or ping me here on IRC. Once the generator is in good shape, we’ll likely set up a cloud-based development experience. This will be driven by the generators and backed by pre-packaged containers that can be used in the cloud, on our development machines, and possibly in customer dev, stage, and prod environments. Imagine quickly packaging your configuration and customizations with Alfresco into an all-dependencies included container. You could then run tests against the container, deploy that tested container to stage, perform UAT and—assuming everything is accepted—promote the exact same (tested and accepted) container to production.
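The container promotion flow imagined above might be sketched as follows. Everything here is hypothetical: the registry host, image names, tags, and the test-driver script are placeholders for whatever a given project actually uses:

```shell
# Build an image containing Alfresco plus our configuration and
# customizations (registry host and image name are placeholders)
docker build -t registry.example.com/acme/alfresco-solution:1.4.0 .

# Run the test suite against a container started from that exact image
docker run -d --name solution-test -p 8080:8080 \
    registry.example.com/acme/alfresco-solution:1.4.0
./run-integration-tests.sh http://localhost:8080/alfresco  # hypothetical test driver
docker stop solution-test && docker rm solution-test

# Promote the very same image (not a rebuild) to stage for UAT, and
# later to production, by re-tagging and pushing it
docker tag registry.example.com/acme/alfresco-solution:1.4.0 \
           registry.example.com/acme/alfresco-solution:stage
docker push registry.example.com/acme/alfresco-solution:stage
```

The key property is that the artifact tested, accepted, and deployed is byte-for-byte the same image; only the tag moves.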

Now that’s the future of Alfresco development.

Zia’s Simple + Secure Back Office: AP Automation

For many organizations, their most critical corporate documents are associated with “back office” corporate functions. From Employee Records to M&A Documents, this content must be shared and stored securely, and content management systems must be integrated with business processes and business applications. To address this need, Zia Consulting has created the Simple + Secure Back Office, providing solutions for multiple departments including accounts payable.

Join Senior Solution Engineer Jon Solove for a quick demonstration of Zia’s AP Automation Solution, leveraging Ephesoft’s Intelligent Document Capture technology.