Monday, December 18, 2023

The State of Generative AI: A Survey of 2023 and Outlook for 2024

Introduction

The year 2023 witnessed significant advancements in the field of Artificial Intelligence (AI), particularly in the realm of generative AI. This article provides an in-depth survey of the key trends in AI development in 2023 and offers insights into the outlook for 2024, shedding light on the transformative effects of generative AI.

Key Trends in AI Development in 2023

Generative AI Advancements The year 2023 marked a period of remarkable progress in generative AI, with substantial implications for industries and workforces. McKinsey's The State of AI in 2023 report highlighted generative AI's potential to reshape entire sectors, and the latest annual McKinsey Global Survey confirmed the explosive growth of generative AI (gen AI) tools, with one-third of respondents using gen AI regularly in at least one business function. OpenAI's GPT-4 emerged as a groundbreaking generative model, advancing the state of the art in natural language processing and creative content generation.

Multi-modal Learning Another notable trend in AI development was the emergence of multi-modal learning, which brought forth new possibilities for AI models and their applications. This approach enabled AI systems to process and understand information from multiple modalities, leading to enhanced performance and versatility.

Economic Impact of AI The economic impact of AI became increasingly pronounced in 2023, influencing global economies and driving discussions on the future of work and productivity. AI could contribute up to $15.7 trillion to the global economy by 2030, more than the current output of China and India combined. Of this, $6.6 trillion is likely to come from increased productivity and $9.1 trillion from consumption-side effects.

Outlook for 2024

Monumental Leaps in AI Capabilities The outlook for 2024 foresees monumental leaps in AI capabilities, particularly in areas demanding complex problem-solving, potentially accelerated by advances in quantum computing. These advances are expected to redefine the boundaries of AI applications and unlock new frontiers in technology.

Rapid Transformations Across Industries 2024 is poised to witness rapid transformations across industries, with generative AI expected to play a pivotal role in reshaping business operations and driving regulatory discourse. Recent studies examine the real-world application of AI in multiple sectors, including healthcare, finance, agriculture, retail, energy, and automotive, with case studies illustrating AI's impact on each. The AI Dossier highlights the most compelling use cases of AI in six major industries, providing insights into the practical applications of AI across various sectors.

Challenges and Potential Failures in Generative AI Initiatives Despite the promising outlook, 2024 also brings challenges and potential failures in generative AI initiatives. Predictions suggest that many such initiatives will encounter obstacles or fail outright, underscoring the complexities and uncertainties associated with generative AI.

Industry Outlook for Generative AI The industry outlook for generative AI reflects early promise, potential payoffs, and uncertainties. AI is helping organizations in the energy, resources, and industrials sector to innovate rapidly, reduce their climate impact, and increase business productivity. The impact of AI on a hospitality company has also been studied, providing insights into the transformative effects of AI in the hospitality sector.

Impact of AI on Worldwide IT Spending Projections for 2024 indicate a significant impact of AI on worldwide IT spending, with expectations of substantial growth driven by the integration of AI technologies across various sectors. The influence of AI on global IT spending is set to shape the landscape of technology investments and strategies.

Conclusion

The year 2023 marked a pivotal phase in the evolution of generative AI, setting the stage for transformative developments in the year ahead. As we look towards 2024, the landscape of AI is poised for monumental leaps, rapid transformations, and the navigation of challenges and uncertainties. The journey of generative AI continues to unfold, shaping the future of technology and innovation on a global scale.

Thursday, December 07, 2023

Gemini: A New Family of Multimodal Models

Have you ever wondered what it would be like to have a model that can understand and reason across different types of data, such as text, images, audio, and video? Well, wonder no more, because Google has just introduced Gemini, a new family of multimodal models that can do just that!

Gemini models are trained on a large and diverse dataset of image, audio, video, and text data, and can handle a wide range of tasks, such as summarizing web pages, translating speech, generating images, and answering questions. Gemini models come in three sizes: Ultra, Pro, and Nano, each designed for different applications and scenarios.

In this blog post, we will give you an overview of the Gemini model family, its capabilities, and some of the exciting use cases it enables. We will also discuss how Google is deploying Gemini models responsibly and ethically, and what the implications and limitations of this technology are.

What is Gemini?

Gemini is a family of highly capable multimodal models developed at Google. Gemini models are based on Transformer decoders, which are enhanced with improvements in architecture and model optimization to enable stable training at scale and optimized inference on Google’s Tensor Processing Units.

Gemini models can accommodate textual input interleaved with a wide variety of audio and visual inputs, such as natural images, charts, screenshots, PDFs, and videos, and they can produce text and image outputs. Gemini models can also directly ingest audio signals at 16kHz from Universal Speech Model (USM) features, enabling them to capture nuances that are typically lost when the audio is naively mapped to a text input.

Gemini models are trained jointly across image, audio, video, and text data, with the goal of building a model with both strong generalist capabilities across modalities and cutting-edge understanding and reasoning performance in each respective domain.

What can Gemini do?

Gemini models can perform a variety of tasks across different modalities, such as:

  • Text understanding and generation: Gemini models can understand natural language and generate fluent and coherent text for various purposes, such as summarization, question answering, instruction following, essay writing, code generation, and more. Gemini models can also handle multilingual and cross-lingual tasks, such as translation, transcription, and transliteration.
  • Image understanding and generation: Gemini models can understand natural images and generate captions, descriptions, questions, and answers about them. Gemini models can also generate images from text or image prompts, such as creating logos, memes, illustrations, and more. Gemini models can also handle complex image types, such as charts, diagrams, and handwritten notes, and reason about them.
  • Audio understanding and generation: Gemini models can understand audio signals and generate transcripts, translations, summaries, and questions and answers about them. Gemini models can also generate audio from text or audio prompts, such as synthesizing speech, music, sound effects, and more. Gemini models can also handle different audio types, such as speech, music, and environmental sounds, and reason about them.
  • Video understanding and generation: Gemini models can understand videos and generate captions, descriptions, questions, and answers about them. Gemini models can also generate videos from text or video prompts, such as creating animations, clips, trailers, and more. Gemini models can also handle different video types, such as movies, documentaries, lectures, and tutorials, and reason about them.
  • Multimodal understanding and generation: Gemini models can understand and reason across different types of data, such as text, images, audio, and video, and generate multimodal outputs, such as image-text, audio-text, video-text, and image-audio. Gemini models can also handle complex multimodal tasks, such as verifying solutions to math problems, designing web apps, creating educational content, and more.
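To make the multimodal behavior concrete, here is a minimal sketch of calling Gemini through the google-generativeai Python SDK. The model names, the PIL-based image input, and the API key handling reflect the initial Gemini API release and are assumptions to verify against the current documentation:

import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # assumption: a key from Google AI Studio

# Text-only prompt against the Pro model
model = genai.GenerativeModel("gemini-pro")
print(model.generate_content("Summarize the Gemini model family in two sentences.").text)

# Interleaved image + text prompt against the vision model
vision = genai.GenerativeModel("gemini-pro-vision")
chart = Image.open("chart.png")  # hypothetical local file
print(vision.generate_content([chart, "What trend does this chart show?"]).text)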

Evaluation

Gemini models have achieved state-of-the-art results on various benchmarks and tasks, demonstrating their strong performance and generalization capabilities. For example, Gemini models have achieved the following results:

  • MMLU (Massive Multitask Language Understanding): Gemini Ultra achieved the highest reported score on the MMLU benchmark, which measures knowledge and reasoning across 57 subjects spanning STEM, the humanities, and the social sciences, and was reported to be the first model to exceed human-expert performance on it.
  • MMMU (Massive Multi-discipline Multimodal Understanding): Gemini achieved the highest score on the MMMU benchmark, which consists of college-level problems across multiple disciplines that require deliberate reasoning over images and text, demonstrating strong cross-modal reasoning and inference.
  • ChartQA: Gemini models have achieved the highest score on the ChartQA benchmark, which measures the ability of models to answer questions about charts and graphs. Gemini models have outperformed other models by a large margin, especially on tasks that require complex reasoning and inference.
  • CoVoST 2: Gemini models achieved state-of-the-art results on the CoVoST 2 benchmark, which measures speech-to-text translation across multiple languages, demonstrating strong cross-modal and cross-lingual capability.

These results demonstrate the impressive performance and potential of Gemini models for various applications and domains.

Applications

Gemini models have many potential use cases and benefits for various domains and users, such as:

  • Education: Gemini models can help students and teachers to learn and teach more effectively and efficiently, by providing personalized feedback, generating educational content, and facilitating communication across languages and modalities. For example, Gemini models can help students to solve math problems, learn new languages, and create multimedia projects.
  • Creativity: Gemini models can help artists and designers to create and explore new ideas and styles, by generating images, music, and videos based on their prompts and preferences. For example, Gemini models can help designers to create logos, posters, and websites, and help musicians to compose songs and soundtracks.
  • Communication: Gemini models can help people to communicate and collaborate more effectively and inclusively, by providing real-time translation, transcription, and summarization of speech and text across languages and modalities. For example, Gemini models can help business people to negotiate deals, scientists to share research findings, and activists to raise awareness about social issues.
  • Information Extraction: Gemini models can help researchers and analysts to extract and summarize relevant information from large and complex datasets, by processing text, images, audio, and video data. For example, Gemini models can help journalists to investigate news stories, analysts to predict market trends, and doctors to diagnose diseases.
  • Problem Solving: Gemini models can help people to solve complex problems and make informed decisions, by providing accurate and reliable information, and reasoning across different modalities and domains. For example, Gemini models can help engineers to design new products, lawyers to argue cases, and politicians to make policies.

These applications demonstrate the versatility and potential impact of Gemini models for various users and domains.

Limitations and Challenges

Gemini models also have some limitations and challenges that need to be addressed and mitigated, such as:

  • Factuality: Gemini models may generate outputs that are not factually accurate or reliable, especially when dealing with complex or ambiguous information. Gemini models may also generate outputs that are biased or offensive, especially when trained on biased or offensive data. To mitigate these issues, Gemini models are developed and evaluated using rigorous and diverse datasets and metrics, and are subject to human review and feedback.
  • Hallucination: Gemini models may generate outputs that are not coherent or meaningful, especially when dealing with rare or unseen data. Gemini models may also generate outputs that are unrealistic or inappropriate, especially when trained on unrealistic or inappropriate data. To mitigate these issues, Gemini models are trained and evaluated using diverse and realistic datasets and metrics, and are subject to human review and feedback.
  • Safety: Gemini models may generate outputs that are harmful or dangerous, especially when dealing with sensitive or confidential data. Gemini models may also generate outputs that violate privacy or security, especially when trained on private or sensitive data. To mitigate these issues, Gemini models are developed and deployed following a structured approach to impact assessment, model policies, evaluations, and mitigations, and are subject to legal and ethical compliance.
  • Ethics: Gemini models may generate outputs that perpetuate or amplify social biases or stereotypes, especially when trained on biased or stereotypical data. Gemini models may also generate outputs that violate cultural norms or values, especially when trained on data from different cultures or contexts. To mitigate these issues, Gemini models are developed and evaluated using diverse and representative datasets and metrics, and are subject to human review and feedback. Gemini models are also designed to be transparent and interpretable, so that users can understand how they work and what they learn from the data.

Gemini is a new model, whereas other large language models such as GPT-4 have already been deployed and used in many applications. Given that effective prompts can differ considerably between LLMs, Gemini will face competition from other LLMs despite its state-of-the-art benchmark performance. However, Gemini has some unique features and advantages that distinguish it from other LLMs, such as its native multimodal capabilities, the computing power behind it, and its advanced training and optimization techniques. It will be interesting to see how Gemini evolves and competes with other LLMs in the future, and how it will benefit humanity in various ways.



Sunday, November 05, 2023

The Next Evolution of AI: Introducing Self-Reflecting Multi-Agent RAG Systems

Artificial intelligence systems have seen remarkable advances in recent years through innovations like large language models and multi-modal architectures. However, critical weaknesses around reasoning, factual grounding, and self-improvement remain barriers to robust performance. An emerging approach combining collaborative multi-agent interaction, retrieval of external knowledge, and recursive self-reflection holds great promise for overcoming these limitations and ushering in a new evolution of AI capabilities.

Multi-agent architectures draw inspiration from the collaborative intelligence exhibited by human groups and natural systems. By dividing cognitive labor across specialized agents, strengths are amplified and limitations mitigated through symbiotic coordination. Retrieval augmented systems fuse this collective competence with fast access to external databases, texts, and real-time data - providing vital context missing within isolated models today.

Building recursive self-reflection into both individual agents and the system as a whole closes the loop, enabling continuous optimization of reasoning processes in response to insights gained through experience. This capacity for introspection and deliberative thinking lets the system operate in a manner closer to System 2 thinking: slow, deliberate, and analytical.

For example, by assigning distinct agents specialized roles in critiquing assertions made by peer agents, the system essentially institutes deliberate verification procedures before accepting conclusions. Facts retrieved from external sources can validate or invalidate the logic applied, promoting evidence-based rational thinking. And recursive self-questioning provides a mechanism for critical analysis of internal beliefs and assumptions that overrides blindspots.

Together, these three mechanisms offer the foundations for AI systems capable of open-ended self-improvement through rigorous reasoning and abstraction. Peer critiquing across diverse agents identifies gaps or contradictions in thinking. Real-world data grounds beliefs in empirical facts. And recursive self-questioning pressure tests the integrity of all assertions made. Through such recurring examination, vulnerabilities are uncovered and addressed, incrementally strengthening the system's intelligence.
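As a rough illustration of how these three mechanisms could compose, here is a minimal sketch in Python. The llm() and retrieve() callables are hypothetical stand-ins for a language model call and an external knowledge lookup, not a real API:

def self_reflecting_answer(question, llm, retrieve, rounds=2):
    # Initial draft from a "proposer" agent
    draft = llm("Answer the question:\n" + question)
    for _ in range(rounds):
        # Peer critique: a second agent hunts for gaps and contradictions
        critique = llm("Critique this answer for errors, gaps, or contradictions:\n" + draft)
        # Retrieval grounding: fetch external evidence on the contested points
        evidence = "\n".join(retrieve(critique))
        # Recursive self-reflection: revise in light of critique and evidence
        draft = llm(
            "Question: " + question + "\n"
            "Draft: " + draft + "\n"
            "Critique: " + critique + "\n"
            "Evidence: " + evidence + "\n"
            "Revise the draft, correcting anything the critique or evidence contradicts."
        )
    return draft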

While substantial research is still needed to effectively integrate these capabilities, the promise is profound. Such an architecture could greatly augment AI aptitude for deep domain expertise, creative speculation, and nuanced judgment - hallmarks of advanced general intelligence. The result is systems exhibiting the flexible cognition, contextual reasoning, and capacity for lifelong learning characteristic of human minds but at machine scale.


Wednesday, November 01, 2023

FLOW Pattern: Comprehensive Knowledge Extraction with Large Language Models

Intent:

The FLOW pattern aims to enable the systematic extraction, organization, and synthesis of comprehensive knowledge from large documents or transcripts using large language models, such as GPT-3. It addresses the limitation of these models' context window, allowing them to handle large documents by chunking them across the different stages of the pattern.

Motivation:

Large language models often have a limited context window, which restricts their ability to ingest and analyze complete large documents at once. The FLOW pattern seeks to overcome this limitation by breaking down the document into manageable chunks, enabling a more comprehensive analysis and synthesis of information.

Implementation:

  1. Find: Extract specific topics or perspectives from the document or transcript in manageable chunks, considering the limitation of the model's context window.
  2. Link: Synthesize the extracted content from various chunks of the document or transcript, ensuring coherence and connectivity between different elements, while noting that the information comes from different parts of the source material.
  3. Organize: Structure the linked content in a logical and coherent manner, facilitating a systematic analysis of the document's core concepts and themes using the model's iterative capabilities.
  4. Write: Expand and elaborate on the organized content, producing a well-developed document that is informative, engaging, and accessible to the target audience, leveraging the model's text generation capabilities.
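A minimal sketch of the four stages, assuming a hypothetical complete(prompt) helper that wraps a large language model call:

def flow(chunks, topic, complete):
    # Find: extract the topic from each manageable chunk
    findings = [
        complete("Extract everything about '" + topic + "' from:\n" + chunk)
        for chunk in chunks
    ]
    # Link: synthesize the per-chunk findings, noting that they come from
    # different parts of the same source material
    linked = complete(
        "Synthesize these notes, taken from different parts of one document, "
        "into a connected set of points:\n" + "\n---\n".join(findings)
    )
    # Organize: structure the linked content into a logical outline
    outline = complete("Organize these points into a logical outline:\n" + linked)
    # Write: expand the outline into the final document
    return complete("Write a well-developed, engaging document from this outline:\n" + outline)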

Consequences:

By implementing the FLOW pattern with large language models, organizations can effectively overcome the limitations of the context window, enabling a more comprehensive analysis of large documents and transcripts. This, in turn, enhances the model's ability to provide meaningful insights and make informed decisions based on a more holistic understanding of the underlying information.

Example Use Case:

An organization utilizes the FLOW pattern with a large language model like GPT-3 to create comprehensive meeting minutes from a transcript, effectively overcoming the context window limitation by chunking the document into manageable sections for analysis and synthesis.

Summary:

The FLOW pattern, when implemented with large language models, serves as an effective approach for comprehensive knowledge extraction from large documents or transcripts. By addressing the context window limitation, it enables organizations to derive meaningful insights and make informed decisions based on a more comprehensive understanding of the underlying information.

Thursday, January 26, 2023

Transforming Document Exploration: Using GPT-3 to Create a Chatbot that Answers Your Questions

GPT-3 is the underlying model of ChatGPT. It is an advanced language model that can generate text, but it cannot find information that is not already in the model. In addition, there are limits on how much text can be fed to GPT-3 at once. In this article, we will show you how to use a technique called conditional generation to build a chatbot that can help you explore long documents.

Conditional generation is a way of training a language model to understand the context of a document and generate answers to questions about it. We will take the following steps to build the chatbot:

  1. Create a corpus from the document.
  2. Match the user's question to the relevant passages in the corpus.
  3. Create a prompt that includes the context of the question.
  4. Send the prompt to GPT-3 to generate an answer.
  5. Create a function that combines steps 2-4.
  6. Test the chatbot by creating a question-answer loop.
  7. Use Gradio to create an interface for the chatbot.
In this exercise, we will build a chatbot to answer questions about an employee handbook.

Create a corpus from the document

First, we have prepared a file, HKIHRMEmployeeHandbook.txt, which is a text version of a sample employee handbook created by the Hong Kong Institute of Human Resource Management.

To start, we are using a Python library called "sentence-transformers" to create a corpus of embeddings from the text document "HKIHRMEmployeeHandbook.txt". The sentence-transformers library is an NLP library that uses pre-trained transformer-based models, like BERT, RoBERTa, DistilBERT, and, in this case, multi-qa-mpnet-base-cos-v1, to generate embeddings for each sentence in the corpus. This is useful for our chatbot application because it allows us to find the most relevant passages in our document for the user's query, which helps provide more accurate and useful answers.

Embeddings are a way to represent text in a numerical form that can be used for comparison and similarity calculation. In this case, we are using the sentence-transformers library to convert the text into embeddings. These embeddings are a compact and dense representation of the text, which can be used to compare and measure similarity between different texts.
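A minimal sketch of this step (splitting on blank lines is an assumption; any sensible passage chunking works):

from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("multi-qa-mpnet-base-cos-v1")

# Split the handbook into passages, here simply on blank lines
with open("HKIHRMEmployeeHandbook.txt", encoding="utf-8") as f:
    corpus = [p.strip() for p in f.read().split("\n\n") if p.strip()]

# Encode every passage into a dense embedding tensor
corpus_embeddings = embedder.encode(corpus, convert_to_tensor=True)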

There are several advantages of using embeddings instead of keyword search when building a chatbot for document exploration:
  • Semantic Similarity: Embeddings capture the semantic similarity between sentences, which allows for more accurate and relevant matches between the user's query and the passages in the corpus.
  • Handling synonyms and variations: Embeddings can handle synonyms and variations in the query, which are difficult to handle with traditional keyword search. For example, if the user asks "what are the benefits of working overtime?" and the document talks about "compensation for working beyond regular hours", the embeddings will be able to match the query with the relevant passage, even though the words in the query and document are different.
  • Handling context: Embeddings can also capture the context in which a word is used, which allows for more accurate matches between the query and the passages in the corpus. For example, "what are the benefits of working overtime?" and "what are the drawbacks of working overtime?" have similar keywords but opposite meanings; embeddings help the model understand the context and give the correct answer.
  • Scalability: Embeddings can handle large datasets and can be used to match queries against a large corpus of documents efficiently.
Overall, using embeddings instead of keyword search improves the accuracy and relevance of the chatbot's answers and provides a better user experience.

Match the user's question to the relevant passages in the corpus

Once we have the embeddings, we can find the passages in the document most similar to a given query. We use the util.semantic_search method to find the most similar passages in the corpus based on the query embedding and the corpus embeddings. The top_k variable controls how many passages to retrieve. After getting the passages from the corpus, they are joined together with line breaks to form the context, as sketched below.
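A minimal sketch, reusing the embedder, corpus, and corpus_embeddings from the previous step (the top_k value is illustrative):

from sentence_transformers import util

top_k = 5  # how many passages to retrieve
query = "what should I do if I worked overtime?"
query_embedding = embedder.encode(query, convert_to_tensor=True)

# semantic_search returns, for each query, a ranked list of
# {'corpus_id': ..., 'score': ...} dictionaries
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=top_k)[0]
context = "\n".join(corpus[hit["corpus_id"]] for hit in hits)

The resulting context is as follows: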

2.2 Working Hours 3.2 Overtime Compensation Prior approval must be obtained from respective supervisor for working overtime. Overtime for employees at grade 7-10 will be compensated by pay at the following rates: Category 1 a) For non-shift employees, Monday to Friday starting from an hour after normal working hour to 12:00 midnight and Saturday after 2:00 p.m. b) For shift employees, overtime worked beyond his shift 1.5 times hourly rate Overtime for employees at grade 4-6 will be compensated by way of time off. Such compensation leave will only be granted when their department’ s workload permits. Any overtime not compensated in the form of leave by the end of the calendar year will be paid in the following February at the rate of 1.5 times the employee's hourly rate. 3.3 Annual Bonus Employees who work overtime in the following conditions are entitled to claim for meal allowance: - overtime for 3 consecutive hours from Monday to Saturday; - overtime for 6 consecutive hours on Sunday or public holidays Employees who are required to report for duty outside office hours in case of emergency will be entitled to an emergency allowance and compensation leave 4.5 Compensation Leave or Time-off for Overtime Work Employees at grade 4-6 who have worked overtime will be compensated by way of time off. Such compensation leave will only be granted when their department’ s workload permits. Please also refer to “Overtime Compensation” under Section 3.2 of this handbook for details.

You can observe that all the retrieved passages are more or less relevant to the overtime question.
 

Create a prompt that includes the context of the question

Then, we create a prompt template. The prompt is sent to GPT-3 to generate an answer for the query based on the context. The following prompt will be created:

Context: 2.2 Working Hours 3.2 Overtime Compensation Prior approval must be obtained from respective supervisor for working overtime. Overtime for employees at grade 7-10 will be compensated by pay at the following rates: Category 1 a) For non-shift employees, Monday to Friday starting from an hour after normal working hour to 12:00 midnight and Saturday after 2:00 p.m. b) For shift employees, overtime worked beyond his shift 1.5 times hourly rate Overtime for employees at grade 4-6 will be compensated by way of time off. Such compensation leave will only be granted when their department’ s workload permits. Any overtime not compensated in the form of leave by the end of the calendar year will be paid in the following February at the rate of 1.5 times the employee's hourly rate. 3.3 Annual Bonus Employees who work overtime in the following conditions are entitled to claim for meal allowance: - overtime for 3 consecutive hours from Monday to Saturday; - overtime for 6 consecutive hours on Sunday or public holidays Employees who are required to report for duty outside office hours in case of emergency will be entitled to an emergency allowance and compensation leave 4.5 Compensation Leave or Time-off for Overtime Work Employees at grade 4-6 who have worked overtime will be compensated by way of time off. Such compensation leave will only be granted when their department’ s workload permits. Please also refer to “Overtime Compensation” under Section 3.2 of this handbook for details. Answer the following question: Q: what should I do if I worked overtime? A:

You can try a few other forms like:
  • Answer the following question. If the answer cannot be found in the context, write "I don't know"
  • Answer the following question. If the answer cannot be found in the context, rewrite the question that is more related to the context by starting with "Do you want to ask" and then elaborate the answer
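Either way, the prompt assembly itself is a one-liner. A minimal sketch (the helper name is illustrative):

def build_prompt(context, query):
    # Retrieved passages first, then the instruction and the question
    return (
        "Context: " + context + "\n"
        "Answer the following question:\n"
        "Q: " + query + "\n"
        "A:"
    )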

Send the prompt to GPT-3 to generate an answer

The following code uses the OpenAI API to generate a response for the prompt. The snippet below is a minimal sketch, assuming the completions endpoint with text-davinci-003 (the GPT-3 completion model of the day) and illustrative parameters.
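import openai

openai.api_key = "YOUR_API_KEY"

def generate_answer(prompt):
    # text-davinci-003 was the then-current GPT-3 completion model
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=256,
        temperature=0,
    )
    return response["choices"][0]["text"].strip()

GPT-3 comes up with an answer that fits the context: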

If you are an employee at grade 7-10, you should obtain prior approval from your supervisor and you will be compensated by pay at the rate of 1.5 times your hourly rate. If you are an employee at grade 4-6, you will be compensated by way of time off. Such compensation leave will only be granted when your department’s workload permits. Any overtime not compensated in the form of leave by the end of the calendar year will be paid in the following February at the rate of 1.5 times your hourly rate.

Create a function that wraps up everything

Wrapping the code into a function called answer(query) allows for easy reuse of the code with different queries. This function takes in a query string and returns the response from the API, so you can ask new questions without repeating the previous steps every time. A sketch, reusing the helpers from the earlier steps:
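def answer(query):
    # Retrieve the most relevant passages for the query
    query_embedding = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=top_k)[0]
    context = "\n".join(corpus[hit["corpus_id"]] for hit in hits)
    # Build the prompt and ask GPT-3 for the answer
    return generate_answer(build_prompt(context, query))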

Test the chatbot by creating a question-answer loop

Here, we have an infinite loop that repeatedly prompts the user for a query and exits when the user enters "xxx". You can input multiple queries and get answers without having to restart the code, which makes it easy to test the function. A minimal loop:
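while True:
    query = input("Q: ")
    if query == "xxx":  # sentinel value to exit the loop
        break
    print("A:", answer(query))
    print("=" * 70)

The output will be as follows: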

Q: do I get paid if I got sick? A: Yes, employees who are not able to report for duty due to illness are entitled to full pay sick leave, subject to a maximum of the lesser of 120 days or the number of sickness days that the employee has accumulated according to provisions in the Employment Ordinance. Additionally, employees who suffer from injury arising out of and in the course of employment are entitled to compensation in accordance with the Employee Compensation Ordinance. ====================================================================== Q: what should I do if I got sick? A: If I become ill, I should notify my immediate supervisor as soon as possible and provide a medical certificate if necessary. I am entitled to full pay sick leave, subject to a maximum of 120 days or the number of sickness days I have accumulated according to the Employment Ordinance. If I have to leave during office hours due to sickness or personal reasons, I must obtain prior approval from either my supervisor, department manager or the Human Resources Department. ====================================================================== Q: xxx

Use Gradio to create an interface for the chatbot

Finally, let's use the Gradio library to create a GUI for the answer(query) function defined earlier. A minimal sketch (the interface title is illustrative):
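import gradio as gr

demo = gr.Interface(fn=answer, inputs="text", outputs="text",
                    title="Employee Handbook Q&A")
demo.launch()

Now you have a more user-friendly GUI for asking questions.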
Try it out in Colab.

Friday, August 19, 2022

AppDynamics Agent Instrumentation for Java 6 Applications

AppDynamics Agent can monitor Java 6 applications. However, modern services no longer support TLS versions older than TLS 1.2, so the SaaS controller does not accept TLS 1.1, TLS 1.0, or SSL 3.0. On the other hand, being a 2006-era technology, Java 6 does not support TLS 1.2 out of the box.

Upgrading the JVM seems to be an obvious solution. However, it might not be feasible, as it would involve a lot of re-testing effort. Some libraries might not even be compatible with a newer JVM.

Fortunately, we can make use of Java's pluggable Java Cryptography Extension (JCE) and Java Secure Socket Extension (JSSE) architecture. Bouncy Castle is one library that provides these security features [ref: stackoverflow.com].

The Dockerfile for this setup demonstrates the following:

  1. Use yannart/jboss-5.1.0.ga-jdk6 as the base image
  2. Copy the content of AppServerAgent to /opt/appdynamics
    • Java agent is separately downloaded
    • controller-info.xml shall be configured 
  3. Download and install jurisdiction policy files for "unlimited strength" cryptography
  4. Download and install Bouncy Castle 1.66
  5. Copy the java.security file
  6. Configure Java agent instrumentation via environment variable

The changes to java.security are as follows:

  1. Set the priority of Bouncy Castle as the first provider of JCE and JSSE
  2. Set the default SocketFactory

For what it's worth, the key entries of the java.security file look like this (the provider class names come from the Bouncy Castle distribution; treat the exact socket factory entry as an assumption to verify):
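# Put Bouncy Castle first for both JCE and JSSE
security.provider.1=org.bouncycastle.jce.provider.BouncyCastleProvider
security.provider.2=org.bouncycastle.jsse.provider.BouncyCastleJsseProvider
security.provider.3=sun.security.provider.Sun
# ... the remaining stock providers shift down accordingly

# Use the Bouncy Castle JSSE socket factory by default
ssl.SocketFactory.provider=org.bouncycastle.jsse.provider.SSLSocketFactoryImpl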

Thursday, June 16, 2022

Parsing Formula with Python

For simple formula parsing, we can make use of the ast package in Python. It builds an abstract syntax tree that we can traverse recursively. Here is an example of how to convert a formula into a structured dictionary (a minimal sketch; the dictionary layout is illustrative):
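import ast

def to_dict(node):
    # Recursively convert an AST node into a structured dictionary
    if isinstance(node, ast.Expression):
        return to_dict(node.body)
    if isinstance(node, ast.BinOp):
        return {
            "op": type(node.op).__name__,  # e.g. 'Add', 'Mult'
            "left": to_dict(node.left),
            "right": to_dict(node.right),
        }
    if isinstance(node, ast.Name):
        return {"var": node.id}
    if isinstance(node, ast.Constant):
        return {"value": node.value}
    raise ValueError("Unsupported node: " + type(node).__name__)

tree = ast.parse("a * 2 + b", mode="eval")
print(to_dict(tree))
# {'op': 'Add', 'left': {'op': 'Mult', 'left': {'var': 'a'}, 'right': {'value': 2}}, 'right': {'var': 'b'}}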

Monday, July 09, 2018

Testing with Intel Movidius Neural Compute Stick

One day, I came across this USB stick.

It is a USB stick with the Myriad 2 VPU, a chip specialized for convolutional neural networks. This is generally good news, as not all computers come with an expensive graphics card, and some devices, like the Raspberry Pi, cannot host one at all. So, a USB stick seems to be a perfect solution.

It has a FaceNet example for TensorFlow, so I decided to try it out. It targets Ubuntu on PC at the moment and specifically supports only version 16.04. Luckily, virtual machines are supported as well, so it is OK to run on my Windows 10 PC.

The NCSDK v2 installation is pretty painless. Then I set up the NC App Zoo for the FaceNet sample. The run.py in the sample is pretty useless: it just takes an image, without detecting the face, resamples it to 180x180, and feeds it into the model. Obviously, that is not how FaceNet works.

Since the original TensorFlow implementation from David Sandberg already has a compare.py, I copied it and the MTCNN dependencies over and renamed it compare_tf.py. Then I made a copy and updated it to use the NCSDK as compare_nc.py. The results are as follows:

compare_nc.py
Images:
0: elvis-presley-401920_640.jpg
1: neal_2017-12-19-155037.jpg
2: president-67550_640.jpg
3: trump.jpg
4: valid.jpg

Distance matrix
        0         1         2         3         4
0    0.0000    0.6212    0.6725    0.7981    0.5387
1    0.6212    0.0000    0.8101    0.7106    0.5050
2    0.6725    0.8101    0.0000    0.6509    0.6273
3    0.7981    0.7106    0.6509    0.0000    0.6946
4    0.5387    0.5050    0.6273    0.6946    0.0000
compare_tf.py
Images:
0: elvis-presley-401920_640.jpg
1: neal_2017-12-19-155037.jpg
2: president-67550_640.jpg
3: trump.jpg
4: valid.jpg

Distance matrix
        0         1         2         3         4
0    0.0000    1.4255    1.3354    1.3078    1.4498
1    1.4255    0.0000    1.5454    1.4255    0.6372
2    1.3354    1.5454    0.0000    1.2032    1.4949
3    1.3078    1.4255    1.2032    0.0000    1.4904
4    1.4498    0.6372    1.4949    1.4904    0.0000
Hmm, not the same results as expected. I don't think it is a hardware failure; more likely there are conversion problems when converting the TensorFlow model to the NCSDK model.

Conclusion: it is quite primitive at the moment. For common models like Inception or MobileNet, it might work well. For custom models, good luck.

GitHub Link: https://github.com/compustar/ncappzoo/tree/ncsdk2/tensorflow/facenet

Tuesday, July 14, 2015

Building a Source Code Generator with AngularJS

If you are looking to generate scaffolding for AngularJS, this is not what this blog post is about. This post is about generating source code with AngularJS.

The principle of creating source code generators is pretty simple (details here): a model, a template, and a driver. Given:
  1. HTML5 can access local files, which serve as the data model
  2. AngularJS has its own template language
  3. JavaScript is the driver
The following code in JSFiddle takes a tab-delimited (3-column) file and generates an Oracle script:


The syntax highlighting is done by SHJS.

Since it takes any JavaScript object as the model, other potential input formats are JSON and XML (with aid of some library). This is a lightweight cross-platform source code generator.

Friday, May 23, 2014

BizTalk Atomic Update for Side-by-Side Orchestration Deployment

When deploying an orchestration side-by-side, you have to unenlist the old version and start the new version. To minimize downtime, the process should be done atomically, so that the old orchestration does not pick up new messages and the new orchestration instances are activated by them instead.

BizTalk exposes the .Net API to manage the orchestrations. The following script demonstrates how to use PowerShell to invoke the API so that unenlisting and starting the orchestrations could be done atomically.

To use the script, update $connectionString and the expressions in the $unenlisting and $starting variables so that they match the assembly qualified names of the orchestrations. The script also takes one argument, "test" or "trial", so that you can check whether the expressions are correct.
### Settings of the script ###
$test = $FALSE
If ($args.Count -gt 0) {
    $test = ($args[0] -eq "test") -or ($args[0] -eq "trial")
}
$connectionString = "SERVER=.;DATABASE=BizTalkMgmtDb;Integrated Security=SSPI"
$unenlisting = `
    @( `
      '^Test.*' `
    )
$starting = `
    @( `
      '.*SubOrchestration.*' `
    )

### Initialize BizTalk catalog explorer ###
add-type -assemblyName "Microsoft.BizTalk.ExplorerOM, Version=3.0.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL"
$catalog = New-Object -TypeName Microsoft.BizTalk.ExplorerOM.BtsCatalogExplorer
$catalog.ConnectionString = $connectionString

### Matching orchestration's assembly qualified name with the regular expressions in $unenlisting and $starting array ###
If ($test) {
    Write-Host "TEST MODE" -foregroundcolor white
    Write-Host
}
Try
{
    $catalog.Applications | ForEach-Object {
        $_.Orchestrations | ForEach-Object {
            $target = $_
### -like is the wildcard operator
### -match is the regular expression operator

#            $unenlisting | Where-Object {($target.AssemblyQualifiedName -like $_) -and ($target.Status -ne [Microsoft.BizTalk.ExplorerOM.OrchestrationStatus]::Unenlisted)} | ForEach-Object {
            $unenlisting | Where-Object {($target.AssemblyQualifiedName -match $_) -and ($target.Status -ne [Microsoft.BizTalk.ExplorerOM.OrchestrationStatus]::Unenlisted)} | ForEach-Object {
                Write-Host "Unenlisting" $target.FullName $target.BtsAssembly.Version "..." -foregroundcolor yellow
                $target.Status = [Microsoft.BizTalk.ExplorerOM.OrchestrationStatus]::Unenlisted
            }
#            $starting | Where-Object {($target.AssemblyQualifiedName -like $_) -and ($target.Status -ne [Microsoft.BizTalk.ExplorerOM.OrchestrationStatus]::Started)} | ForEach-Object {
            $starting | Where-Object {($target.AssemblyQualifiedName -match $_) -and ($target.Status -ne [Microsoft.BizTalk.ExplorerOM.OrchestrationStatus]::Started)} | ForEach-Object {
                Write-Host "Starting   " $target.FullName $target.BtsAssembly.Version "..." -foregroundcolor green
                $target.Status = [Microsoft.BizTalk.ExplorerOM.OrchestrationStatus]::Started
            }
        }
    }
    Write-Host
    If ($test) {
        $catalog.DiscardChanges()
        Write-Host "Changes rolled-back for test mode" -foregroundcolor white
    }
    Else {
        $catalog.SaveChanges()
        Write-Host "Changes committed" -foregroundcolor white
    }
}
Catch {
    $catalog.DiscardChanges()
    Write-Host "Changes rolled-back" -foregroundcolor red
    Write-Host $Error -foregroundcolor cyan
}
Write-Host 

Wednesday, March 27, 2013

Troubleshooting Slow BizTalk Admin Console

Since SQL Server is the backend storage of BizTalk, the performance of SQL Server affects the rest of the BizTalk components. When the BizTalk Admin Console is running slowly:
  • Check the sizes of BizTalk databases
  • Check the connectivity to the SQL Server
    • Open a connection with SQL Server Management Studio see if it behaves the same
    • Check for network connectivity
    • Try out different protocol configurations
      • Operations staff may come in and change SQL Server's port to something other than the default 1433, leaving your BizTalk server untouched. The SQL client on the BizTalk server then falls back to Named Pipes after the TCP timeout (depending on the protocol priority).
  • Try setting the Recovery Mode of all BizTalk databases from Full to Simple
    • Not recommended in production environment
    • You should know the implication. If you do not, please read MSDN.

Saturday, June 16, 2012

Time-outs

IIS Management Console

  • Application Pool
    • Advanced Settings
      • Idle Time-out
        One way to conserve system resources is to configure idle time-out settings for the worker processes in an application pool. When these settings are configured, a worker process will shut down after a specified period of inactivity. The default value for idle time-out is 20 minutes.

  • Web Site
    • Advanced Settings
      • Connection Limits
        • Connection Time-out
          Specifies the time (in seconds) that IIS waits before it disconnects a connection that is considered inactive. Connections can be considered inactive for the following reasons:
          • The HTTP.sys Timer_ConnectionIdle timer expired. The connection expired and remains idle.
          • The HTTP.sys Timer_EntityBody timer expired. The connection expired before the request entity body arrived. When it is clear that a request has an entity body, the HTTP API turns on the Timer_EntityBody timer. Initially, the limit of this timer is set to the connectionTimeout value. Each time another data indication is received on this request, the HTTP API resets the timer to give the connection more time, as specified in the connectionTimeout attribute.
          • The HTTP.sys Timer_AppPool timer expired. The connection expired because a request waited too long in an application pool queue for a server application to dequeue and process it. This time-out duration is connectionTimeout.
          The default value is 00:02:00 (two minutes).
    • Features View
      • Configuration Editor
        • Section = system.web/httpRuntime
          • executionTimeout
            Specifies the maximum number of seconds that a request is allowed to execute before being automatically shut down by ASP.NET.

            This time-out applies only if the debug attribute in the compilation element is False. Therefore, if the debug attribute is True, you do not have to set this attribute to a large value in order to avoid application shutdown while you are debugging.

            The default is 110 seconds.

Service Configuration Editor (system.serviceModel)

  • Bindings 
    • closeTimeout
      A TimeSpan value that specifies the interval of time provided for a close operation to complete. This value should be greater than or equal to Zero. The default is 00:01:00.
    • openTimeout

      A TimeSpan value that specifies the interval of time provided for an open operation to complete. This value should be greater than or equal to Zero. The default is 00:01:00.
    • receiveTimeout
      A TimeSpan value that specifies the interval of time provided for a receive operation to complete. This value should be greater than or equal to Zero. The default is 00:10:00.
    • sendTimeout
      A TimeSpan value that specifies the interval of time provided for a send operation to complete. This value should be greater than or equal to Zero. The default is 00:01:00.

 BizTalk Server Administration Console

  • BizTalk Group
    •  Platform Settings
      • Hosts
        • Settings
          • General
            • Response Time [see also]
              Specify the default timeout for request response messages.
              When an adapter submits a request-response message, it needs to specify the time-out of the request message on the IBTTransportBatch.SubmitRequestMessage Method (COM) API. A response message will be delivered to the adapter only within this time-out period. After the time-out expires, a negative acknowledgment (NACK) will be delivered to the adapter instead of the response message. If the adapter does not specify a time-out value, the engine uses the default value of 20 minutes.
      • Applications
        • [Application]
          • Send Ports/Receive Locations
            • [Send Port/Receive Location]
              • Properties
                • Type = Any WCF Adapter > Configure...
                  • Binding

    Wednesday, April 27, 2011

    ASP.NET MVC View Engine with FreeMarker

    FreeMarker is quite a successful template engine in the Java world. It is used with Apache Struts, an MVC framework, as a view engine. ASP.NET MVC uses Web Forms as the default view engine. The problem is that the view becomes a spaghetti of HTML and C# snippets fairly quickly, just like classic ASP and PHP. FreeMarker.Net is a project that ports FreeMarker as an ASP.NET MVC view engine.

    The idea is to compile FreeMarker to a .Net assembly with IKVM and create a wrapper around .Net objects so that FreeMarker can understand them.

    Compiling FreeMarker
    It is a straightforward process. It can be done with one command:
    ikvmc freemarker.jar -target:library

    Separating Necessary IKVM Libraries
    Only the following libraries are required for development:
    • IKVM.OpenJDK.Beans.dll 
    • IKVM.OpenJDK.Charsets.dll
    • IKVM.OpenJDK.Core.dll
    • IKVM.OpenJDK.SwingAWT.dll
    • IKVM.OpenJDK.Text.dll
    • IKVM.OpenJDK.Util.dll
    • IKVM.Runtime.dll

    Wrapping .Net Object
    FreeMarker does not directly deal with the objects. Instead, it deals with the TemplateModel objects. There are a few template models to be implemented:
    • TemplateBooleanModel
    • TemplateDateModel
    • TemplateHashModel
    • TemplateMethodModelEx
    • TemplateNumberModel
    • TemplateScalarModel
    • TemplateSequenceModel
    FreeMarker provides an ObjectWrapper interface to wrap the raw objects into TemplateModel.

    The NetObjectModel is actually a TemplateHashModel:
    public class NetObjectModel : StringModel, 
                                  TemplateModel, 
                                  TemplateHashModel {
            Dictionary<string, PropertyInfo> props = 
                new Dictionary<string, PropertyInfo>();
            Dictionary<string, MethodModel> methods = 
                new Dictionary<string, MethodModel>();
            Dictionary<string, ExtensionMethodModel> extensionMethods = 
                new Dictionary<string, ExtensionMethodModel>();
    
            public NetObjectModel(object data, 
                                  NetObjectWrapper wrapper)
                : base(data, wrapper){
                var type = data.GetType();
                foreach (var p in type.GetProperties()) {
                    props.Add(p.Name, p);
                }
                foreach (var m in type.GetMethods()) {
                    if (!methods.ContainsKey(m.Name)) {
                        methods.Add(m.Name, 
                                    new MethodModel(data,  
                                                    wrapper, 
                                                    m.Name));
                    }
                }
            }
    
            public virtual TemplateModel get(string key) {
                if (props.ContainsKey(key)) {
                    return wrapper.wrap(props[key].GetGetMethod()
                                                  .Invoke(data, null));
                }
                else if (methods.ContainsKey(key)) {
                    return methods[key];
                }
                else if (wrapper.ExtensionTypes.Count > 0) {
                    if (!extensionMethods.ContainsKey(key)) {
                        extensionMethods[key] = 
                            new ExtensionMethodModel(data, 
                                                     wrapper, 
                                                     key);
                    }
                    return extensionMethods[key];
                }
                else {
                    return TemplateModel.__Fields.NOTHING;
                }            
            }
    
            public virtual bool isEmpty() {
                return props.Count == 0;
            }
        }
    

    To adapt the ASP.NET objects, three more template models are created:
    • HttpApplicationStateModel
    • HttpRequestModel
    • HttpSessionStateModel
    They are similar but do not share the same interface; all of them look like this:
    public class HttpApplicationStateModel : NetObjectModel, 
                                             TemplateModel, 
                                             TemplateHashModel {
    
            public HttpApplicationStateModel(
                 object data, 
                 NetObjectWrapper wrapper)
                : base(data, wrapper) {
                
            }
    
            public override TemplateModel get(string key) {
                HttpApplicationStateBase dic = 
                    data as HttpApplicationStateBase;
                if (dic.Keys.Cast<string>().Contains(key)) {
                    return wrapper.wrap(dic[key]);
                }
                else {
                    return base.get(key);
                }
            }
    
            public override bool isEmpty() {
                IDictionary<string, object> dic = 
                    data as IDictionary<string, object>;
                return dic.Count == 0 && base.isEmpty();
            }
        }
    View Engine
    The view engine is fairly simple; it just initializes the FreeMarker configuration.
    public class FreemarkerViewEngine : VirtualPathProviderViewEngine {
            Configuration config;
            public FreemarkerViewEngine(string rootPath, 
                                        string encoding = "utf-8") {
                config = new Configuration();
                config.setDirectoryForTemplateLoading(
                    new java.io.File(rootPath));
                AspNetObjectWrapper wrapper = 
                    new AspNetObjectWrapper();
    
                config.setObjectWrapper(wrapper); 
                base.ViewLocationFormats = 
                    new string[] { "~/Views/{1}/{0}.ftl" };
                config.setDefaultEncoding(encoding);
                base.PartialViewLocationFormats = 
                    base.ViewLocationFormats;
            }
    
            protected override IView CreatePartialView(
                ControllerContext controllerContext, 
                string partialPath) {
                return new FreemarkerView(config, partialPath);
            }
    
            protected override IView CreateView(
                ControllerContext controllerContext, 
                string viewPath, 
                string masterPath) {
                return new FreemarkerView(config, viewPath);
            }
        }
    

    View
    FreemarkerView implements IView to render the content. It simply creates a dictionary of variable bindings and invokes FreeMarker.
    public class FreemarkerView : IView{
        public class ViewDataContainer : IViewDataContainer {
            private ViewContext context;
            public ViewDataContainer(ViewContext context) {
                this.context = context;
            }
    
            public ViewDataDictionary ViewData {
                get {
                    return context.ViewData;
                }
                set {
                    context.ViewData = value;
                }
            }
        }
    
        private freemarker.template.Configuration config;
        private string viewPath;
        public FreemarkerView(freemarker.template.Configuration config, string viewPath) {
            this.config = config;
            this.viewPath = viewPath;
        }
    
        public void Render(ViewContext viewContext, 
                            System.IO.TextWriter writer) {
            Template temp = config.getTemplate(viewPath.Substring(2));
                
            Dictionary<string, object> data = 
                            new Dictionary<string, object>{
                {"model", viewContext.ViewData.Model},
                {"session", viewContext.HttpContext.Session},
                {"http", viewContext.HttpContext},
                {"request", viewContext.HttpContext.Request},
                {"application", viewContext.HttpContext.Application},
                {"view", viewContext},
                {"controller", viewContext.Controller},
                {"url", new UrlHelper(viewContext.RequestContext)},
                {"html", new HtmlHelper(viewContext, 
                                new ViewDataContainer(viewContext))},
                {"ajax", new AjaxHelper(viewContext, 
                                new ViewDataContainer(viewContext))},
            };
    
            Writer output = new JavaTextWriter(writer);
            temp.process(data, output);
            output.flush();
        }
    }
    
    Configuration
    Configuration is as simple as adding the view engine to the view engines collection.
    protected void Application_Start() {
        AreaRegistration.RegisterAllAreas();
    
        // ****** Optionally, you can remove all other view engines ******
        //ViewEngines.Engines.Clear();
        ViewEngines.Engines.Add(new FreemarkerViewEngine(this.Server.MapPath("~/")));
    
        RegisterGlobalFilters(GlobalFilters.Filters);
        RegisterRoutes(RouteTable.Routes);
    }
    

    Extension Methods
    ASP.NET MVC relies quite heavily on extension methods. In the NetObjectWrapper, the following code is added to find the extension methods:
    public virtual void AddExtensionNamespace(string ns) {
        var types = this.ExtensionTypes;
        foreach (var a in AppDomain.CurrentDomain.GetAssemblies()) {
            try {
                foreach (var t in a.GetExportedTypes()) {
                    if (!types.Contains(t) && 
                        t.IsClass && 
                        t.Name.EndsWith("Extensions") && 
                        t.Namespace == ns && 
                        t.GetConstructors().Length == 0) {
                        types.Add(t);
                    }
                }
            }
            catch { }
        }
    }
    
    In the view engine's constructor, it reads the namespaces specified under the system.web/pages section in web.config:
    PagesSection section = 
        (PagesSection)WebConfigurationManager
                          .OpenWebConfiguration("~/")
                          .GetSection("system.web/pages");
    if (section != null) {
        foreach (NamespaceInfo info in section.Namespaces) {
            wrapper.AddExtensionNamespace(info.Namespace);
        }
    }
    
    An ExtensionMethodModel class is created to find the appropriate method to invoke:
    foreach (var type in wrapper.ExtensionTypes) {
        MethodInfo method = type.GetMethod(
            methodName, 
            BindingFlags.Public | BindingFlags.Static, 
            Type.DefaultBinder, 
            argTypes, null);
        if (method != null) {
            cache.Add(key, method);
            return wrapper.wrap(method.Invoke(target, args.ToArray()));
        }
    }
    
    Localization
    ResourceManagerDirectiveModel (a TemplateDirectiveModel) and ResourceManagerModel are created so that we could do something like this:
    <@resource type="Freemarker.Net.MvcWeb.App_GlobalResources.PersonResource, Freemarker.Net.MvcWeb"/>
    

    ${res.Title}

    ResourceManagerDirectiveModel is getting the ResourceManager from the resource and put it into res template variable:
    public void execute(freemarker.core.Environment env, java.util.Map parameters, TemplateModel[] models, TemplateDirectiveBody body) {
        if (parameters.containsKey("type")) {
            TemplateScalarModel scalar = 
                (TemplateScalarModel)parameters.get("type");
            var type = Type.GetType(scalar.getAsString());
            env.setVariable("res", 
                env.getObjectWrapper()
                    .wrap(type.GetProperty("ResourceManager", 
                                            BindingFlags.Static | 
                                            BindingFlags.NonPublic | 
                                            BindingFlags.Public)
                            .GetGetMethod(true)
                            .Invoke(null, null)));
        }            
    }
    
    Adding More Directives
    So that more directives can be added in the future, MEF (the Managed Extensibility Framework) is used. A directive implementation only needs to implement and export the ImportableDirective interface.
    [Export(typeof(ImportableDirective))]
    public class ResourceManagerDirectiveModel : TemplateDirectiveModel, ImportableDirective {
    
    The view engine will import them in the constructor:
    public class FreemarkerViewEngine : VirtualPathProviderViewEngine {
            Configuration config;
            [ImportMany]
            IEnumerable<ImportableDirective> directives;
            public FreemarkerViewEngine(string rootPath, 
                                        string encoding = "utf-8") {
                // initialize the config 
                // ...
                // import the extension methods' namespaces
                // ...
                // import the directives
                var dir = new DirectoryInfo(
                    Path.Combine(rootPath, "bin"));
                var catalogs = dir.GetFiles("*.dll")
                    .Where(o => o.Name != "freemarker.dll" && 
                               !(o.Name.StartsWith("IKVM.") || 
                               o.Name.StartsWith("Freemarker.Net")))
                    .Select(o => new AssemblyCatalog(
                                 Assembly.LoadFile(o.FullName))
                );
                var container = new CompositionContainer(
                    new AggregateCatalog(
                        new AggregateCatalog(catalogs),
                        new AssemblyCatalog(
                          typeof(ImportableDirective).Assembly)
                ));
                container.ComposeParts(this);
            }
    

    The source code is available on CodePlex. Please visit http://freemarkernet.codeplex.com/.

    P.S.: The bonus of this project is that we get not only a new view engine for ASP.NET MVC, but also a new template engine for the .Net Framework. Maybe someday we could use FreeMarker to generate code instead of T4 in the Visual Studio SDK.

    Thursday, April 21, 2011

    Caching in .Net Framework 4

    We used to have caching in ASP.NET only. For non-web applications, the Caching Application Block from Enterprise Library may have been the choice. In .Net Framework 4, caching is baked into the framework and is no longer limited to ASP.NET.

    After adding the System.Runtime.Caching assembly as a reference in your project, we can cache the data retrieved by a web service:
    using System.Runtime.Caching;
    class CrmHelper{
        static ObjectCache cache = MemoryCache.Default;
    //...
        string key = string.Format("{0}/{1}", 
                              "account", "address1_addresstypecode");
        MetadataService service = new MetadataService();
        service.Credentials = CredentialCache.DefaultCredentials;
    
        Option[] options = null;
        if (cache.Contains(key)){
            options = (Option[])cache[key];
        }
        else {
            var picklist = (PicklistAttributeMetadata)
                service.RetrieveAttributeMetadata(
                    "account", "address1_addresstypecode");
            options = picklist.Options;
            cache.Set(key, options, new CacheItemPolicy {
                AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(2)
            });
        }
    //...
    }
    if (cache.Contains(key)) ... cache.Set(key, data) is going to be a repeating pattern. We could write an extension method to wrap it up:
    public static class Extensions {
        public static T Get<T>(this ObjectCache cache, 
                               string key,
                               Func<T> retrieveFunction) {
            if (cache.Contains(key)) {
                return (T)cache[key];
            }
            else {
                var result = retrieveFunction();
                cache.Set(key, result, new CacheItemPolicy {
                    AbsoluteExpiration = 
                        DateTimeOffset.Now.AddMinutes(2)
                });
                return result;
            }
        }
    }
    
    The previous code is now cleaned up like this:
    using System.Runtime.Caching;
    class CrmHelper{
        static ObjectCache cache = MemoryCache.Default;
    //...
        string key = string.Format("{0}/{1}", 
                              "account", "address1_addresstypecode");
        MetadataService service = new MetadataService();
        service.Credentials = CredentialCache.DefaultCredentials; 
    
        Option[] options = cache.Get(key, () => {
            var picklist = (PicklistAttributeMetadata)
                service.RetrieveAttributeMetadata(
                    "account", "address1_addresstypecode");
            return picklist.Options;
        });
    //...
    }
    
    It is worth noting that the Caching Application Block from Enterprise Library is being deprecated (http://msdn.microsoft.com/en-us/library/ff664753(v=PandP.50).aspx):
    Caching Application Block functionality is built into .NET Framework 4.0; therefore the Enterprise Library Caching Application Block will be deprecated in releases after 5.0. You should consider using the .NET 4.0 System.Runtime.Caching classes instead of the Caching Application Block in future development.
    If you only want a quick in-memory cache, the new caching API might already fit your needs. However, Enterprise Library provides some other caching implementations (e.g. database, file, etc.). If you want something fancier, Enterprise Library could be an alternative.


    Sunday, March 27, 2011

    WCF Channel Factory and Unity 2.0

    Unity 2.0 does not natively support ChannelFactory in its configuration. However, we can extend Unity. Inspired by Chris Tavares's FactoryElement extension for static factory configuration, here is a ChannelFactory extension for configuring WCF channel factories in Unity.

    The element constructs the generic ChannelFactory<T> type from the registration's target type and calls the CreateChannel() method upon injection.
    public class ChannelFactoryElement : InjectionMemberElement {
            private const string endpointConfigurationNamePropertyName = "endpointConfigurationName";
            private static int numFactories;
            private readonly int factoryNum;
    
            public ChannelFactoryElement() {
                factoryNum = Interlocked.Increment(ref numFactories);
            }
    
            [ConfigurationProperty(endpointConfigurationNamePropertyName, IsKey = false, IsRequired = true)]
            public string Name {
                get { return (string)base[endpointConfigurationNamePropertyName]; }
                set { base[endpointConfigurationNamePropertyName] = value; }
            }
    
            public override string Key {
                get { return "ChannelFactory " + factoryNum; }
            }
    
        public override IEnumerable<InjectionMember> GetInjectionMembers(IUnityContainer container, Type fromType,
            Type toType, string name) {
    
                Type factoryType = typeof(ChannelFactory<>).MakeGenericType(toType);            
                var constructor = factoryType.GetConstructor(new Type[] { typeof(string) });
                var factory = constructor.Invoke(new object[]{Name});
                var method = factoryType.GetMethod("CreateChannel", Type.EmptyTypes);
                return new InjectionMember[] { new InjectionFactory(o => method.Invoke(factory, null)) };
            }
    
        }
    

    The section extension registers ChannelFactoryElement under the element name factory.
    public class ChannelFactoryConfigExtension : SectionExtension {
            public override void AddExtensions(SectionExtensionContext context) {
                context.AddElement<ChannelFactoryElement>("factory");
            }
        }
    

    The factory element uses endpointConfigurationName to retrieve the client endpoint, just like the ChannelFactory<T>(endpointConfigurationName) constructor.
    <unity xmlns="http://schemas.microsoft.com/practices/2010/unity">
        <sectionExtension type="MyNamespace.ChannelFactoryConfigExtension, MyAssembly" />
        <container>      
          <register type="MyNamespace.IMyService, MyAssembly">
            <factory endpointConfigurationName="testing"/>
          </register>
        </container>
      </unity>
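    With this in place, resolving the interface yields a live WCF channel. A minimal usage sketch (assuming the configuration above sits in the application's configuration file; IMyService is the contract from the registration):
    using Microsoft.Practices.Unity;
    using Microsoft.Practices.Unity.Configuration;
    
    class Program {
        static void Main() {
            var container = new UnityContainer();
            container.LoadConfiguration();  // reads the <unity> section shown above
    
            // Each resolve invokes CreateChannel() on the factory behind the scenes.
            IMyService service = container.Resolve<IMyService>();
        }
    }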
    

    Friday, March 25, 2011

    PHP-Azure Migration - Part 4

    Up to the last post of the migration, we had done all the data and file migration. What about the new files being uploaded from the CMS?

    PHP
    We are using a very old version of FCKEditor as our rich text editor. Since FCKEditor supports uploading files from the editor, we need to change the code in editor/filemanager/connectors/php/command.php. Here is an example of getting the folders and files for the file browser.
    function GetFoldersAndFiles( $resourceType, $currentFolder ) {
        // $blobStorageClient and $ContainerName are initialized elsewhere.
        $sServerDir = MapAzureFolder( $resourceType, $currentFolder );
        $blobs = $blobStorageClient->listBlobs( $ContainerName, $sServerDir, '/' );
        $aFolders = array();
        $aFiles = array();
        $folders = GetSessionFolders( $resourceType, $currentFolder );
        foreach ( $blobs as $blob ) {
            $name = GetName( $blob->Name );
            if ( $blob->IsPrefix ) {
                // A prefix is a "folder" derived from the blob paths.
                $folders[$name] = 0;
            } else {
                $iFileSize = $blob->Size;
                if ( !$iFileSize ) {
                    $iFileSize = 0;
                }
                if ( $iFileSize > 0 ) {
                    $iFileSize = round( $iFileSize / 1024 );
                    if ( $iFileSize < 1 ) $iFileSize = 1;
                }
                $aFiles[] = '<File name="' . ConvertToXmlAttribute( $name ) . '" size="' . $iFileSize . '" />';
            }
        }
        foreach ( $folders as $name => $value ) {
            $aFolders[] = '<Folder name="' . ConvertToXmlAttribute( $name ) . '" />';
        }
    }
    
    It is worth mentioning the GetSessionFolders() function. Azure Storage has no concept of folders; all folders are actually derived from the paths of the blobs. So, when listing blobs, we can use $blob->IsPrefix to distinguish the derived folders from the files.

    If a user wants to create a folder and upload a file into it, we need to put the folder into the session, since an empty folder cannot exist in blob storage. That is where the GetSessionFolders() function comes from.

    When uploading a file, we have to handle the Flash format specially. Modern browsers are intelligent enough to recognize image files like JPEG, GIF, and PNG without a content type, but a Flash file will not play properly unless its content type is set.
    function FileUpload( $resourceType, $currentFolder, $sCommand ) {
    // ...
        $sFileName = 'uploadedfiles/' . strtolower( $resourceType ) . $currentFolder . $sFileName;
        $result = $blobStorageClient->putBlob( $ContainerName, $sFileName, $oFile['tmp_name'] );

        // Flash files will not play unless the content type is set explicitly.
        if ( strtolower( $resourceType ) == 'flash' ) {
            $contentType = 'application/x-shockwave-flash';
            $blobStorageClient->setBlobProperties( $ContainerName, $sFileName, null, array( 'x-ms-blob-content-type' => $contentType ) );
        }
        $sFileUrl = $currentFolder . $sFileName;
    // ...
    }
    

    ASP.NET
    For the ASP.NET portal, we have to do the same. Under /editor/filemanager/connectors/aspx/connector.aspx, we changed the first line to:
    <%@ Page Language="c#" Trace="false" Inherits="FCKAzureAdapter.Connector" AutoEventWireup="false" %>
    

    The connector simply extends FredCK.FCKeditorV2.FileBrowser.Connector and overrides the GetFiles(), GetFolders(), and CreateFolder() methods with the Azure API. Here is an example of GetFolders():
    protected override void GetFolders(System.Xml.XmlNode connectorNode, string resourceType, string currentFolder) {
        string sServerDir = this.ServerMapFolder(resourceType, currentFolder);
    
        // Create the "Folders" node.
        XmlNode oFoldersNode = XmlUtil.AppendElement(connectorNode, "Folders");
    
        CloudBlobContainer container = GetCloudBlobContainer(containerPath);
        var dir = container.GetDirectoryReference(sServerDir);
    
        var items = dir.ListBlobs();
        List<string> dirList = new List<string>();
        foreach (var x in items.OfType<CloudBlobDirectory>())
        {
            dirList.Add(x.Uri.ToString());
        }
        var dirs = dirList.Select(o => o.Substring(o.LastIndexOf('/', o.Length - 2)).Replace("/", "")).ToArray();
    
        Dictionary<string, List<string>> folderList = 
                HttpContext.Current.Session["Folders"] as Dictionary<string, List<string>>;
        foreach (var dirName in dirs)
        {
            // Create the "Folder" node.
            XmlNode oFolderNode = XmlUtil.AppendElement(oFoldersNode, "Folder");
            XmlUtil.SetAttribute(oFolderNode, "name", dirName);
            if (folderList != null && 
                folderList.ContainsKey(resourceType) 
                && folderList[resourceType].Contains(dirName))
            {
                folderList[resourceType].Remove(dirName);
            }
        }
    
        if (folderList != null && folderList.ContainsKey(resourceType))
        {
            foreach (var folder in folderList[resourceType].Where(o => o != currentFolder && o.StartsWith(currentFolder)))
            {
                var folderName = folder.Substring(currentFolder.Length).Trim('/');
                if (!folderName.Contains('/')) {
                    // Create the "Folder" node.
                    XmlNode oFolderNode = XmlUtil.AppendElement(oFoldersNode, "Folder");
                    XmlUtil.SetAttribute(oFolderNode, "name", folderName);
                }
            }
        }
    }
    Then, we do the same for editor/filemanager/connectors/aspx/upload.aspx.

    Next time, we will discuss what other problems we have encountered.

    Thursday, March 03, 2011

    PHP-Azure Migration - Part 3

    In the previous post, we fixed all the database queries to make the web site work on SQL Server. The next task is tackling file I/O without the file system, since the file system is not permanent storage in Windows Azure. We need to copy all resource files (the file attachments of articles) from the local file system to Azure Storage.

    File Migration
    The migration seemed quite straightforward: we could just copy the files to Azure Storage. However, there was no FTP support or any batch-upload utility from Microsoft. Given that we had 4GB of files in a hierarchical directory structure, it was infeasible to upload the files one by one. We used Cloud Storage Studio for the migration.

    URL Redirection
    The web site and the resource files are now on different servers, so we can no longer use relative paths. However, since the database stores the relative paths of the resource files, we either change the data (as well as the semantics) to store absolute URIs of the resource files, or redirect the browser to fetch the files from the correct URLs.

    As changing the data would involve a lot of testing of the business logic, we decided to create an ASP.NET module to redirect the requests, given that:
    1. IIS (as well as Windows Azure) allows us to run an ASP.NET module in any web site (including a PHP web site); and
    2. all files were stored in one particular folder, so we could derive the redirect pattern easily. A minimal sketch of such a module is shown below.
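    In the sketch, the class name, the storage account URL, and the /uploadedfiles/ prefix are illustrative assumptions, not the production code:
    using System;
    using System.Web;
    
    public class BlobRedirectModule : IHttpModule {
        // Hypothetical blob container URL; the real account and container differ.
        const string BlobBaseUrl = "http://myaccount.blob.core.windows.net/mycontainer";
    
        public void Init(HttpApplication context) {
            context.BeginRequest += (sender, e) => {
                var app = (HttpApplication)sender;
                string path = app.Request.Url.AbsolutePath;
                // All migrated files live under one folder, so the pattern is simple.
                if (path.StartsWith("/uploadedfiles/", StringComparison.OrdinalIgnoreCase)) {
                    app.Response.RedirectPermanent(BlobBaseUrl + path, true);
                }
            };
        }
    
        public void Dispose() { }
    }
    The module then only needs to be registered under system.webServer/modules in web.config so that IIS runs it for every request, PHP pages included.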
    Now, all existing files have been migrated and are redirected to their new URLs. Next time, let's take a look at how to deal with newly uploaded files.

    Wednesday, March 02, 2011

    PHP-Azure Migration - Part 2

    In the previous post, I discussed the database migration from MySQL to SQL Azure. Data migration is only the first part of data processing; the next step is to modify the queries in the web site due to the syntax differences between the two database systems.

    Driver Adaption
    In the ASP.NET frontend, we only need to replace the MySQL ADO.NET driver with the SQL Server one and it is all done. In the PHP CMS, however, we need to do some adaptation, since PHP does not have a unified API for database drivers.

    Luckily, it is not much work, as there is a driver layer in the CMS. There were only three functions we needed to change: query, fetch, and total, which correspond to mysql_query, mysql_fetch_array, and mysql_num_rows in the MySQL driver, and sqlsrv_query, sqlsrv_fetch_array, and sqlsrv_num_rows in the SQL Server driver.

    Query Modification
    The major differences can be summarized as follows:
    1. Getting last inserted record identity
    2. Paging
    3. Scalar functions
    4. Aggregation functions
    5. Escaping strings
    There is a DAO layer in the CMS that stores all the queries. So, all we did was go through the queries in the DAOs and modify them.

    1. Getting last inserted record identity
    The MySQL PHP driver provides mysql_insert_id() to get the auto-increment column value of the last inserted record. There is no equivalent function in the SQL Server PHP driver; however, it is no more than querying SELECT @@IDENTITY in SQL Server.

    2. Paging
    While paging in MySQL takes only "LIMIT n OFFSET skip", SQL Server needs:
    SELECT * FROM (
        SELECT TOP n * FROM (
            SELECT TOP z columns      -- (where z = n + skip)
            FROM table_name
            ORDER BY key ASC
        ) AS a ORDER BY key DESC
    ) AS b ORDER BY key ASC
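    For example, to fetch the third page of 10 rows (skip = 20, n = 10, hence z = 30) from a hypothetical articles table keyed by id:
    SELECT * FROM (
        SELECT TOP 10 * FROM (
            SELECT TOP 30 id, title
            FROM articles
            ORDER BY id ASC
        ) AS a ORDER BY id DESC
    ) AS b ORDER BY id ASC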

    3. Scalar functions
    MySQL uses NOW() to get the current timestamp; SQL Server uses GETDATE().
    MySQL provides RAND() so we can do random ordering; SQL Server provides NEWID(). Please note that NEWID() generates a GUID, so ordering by it happens to be random. It does not mean NEWID() is the same as MySQL's RAND().

    4. Aggregation functions
    MySQL provides many convenient aggregation functions like GROUP_CONCAT(). Such queries have to be rewritten for SQL Server, or we need to create the corresponding user-defined aggregate functions.

    5. Escaping strings
    Most developers use \' to escape a single quote. In fact, both MySQL and SQL Server use two consecutive single quotes to escape a single quote (e.g. 'O''Brien'); MySQL just accepts the other formats as well.

    Up to this point, we have only discussed the data layer (if you have a data layer in PHP); there is no code change in the business layer yet. This is also an example of how we can modularize a program if we partition the layers clearly, regardless of which platform we are using.

    Tuesday, March 01, 2011

    PHP-Azure Migration - Part 1

    Azure is Microsoft's cloud platform, and its working model differs from conventional hosting in a few ways. We migrated a web site to Azure; the web site consists of a PHP CMS, an ASP.NET frontend, and a MySQL database.

    Database Migration
    There was some discussion in the team about whether to use a MySQL worker role or SQL Azure. As the final decision was SQL Azure, the first task was migrating the MySQL database to SQL Azure. The most obvious tool to work with is Microsoft SQL Server Migration Assistant (SSMA) for MySQL.

    SSMA worked fine for SQL Server itself: it reconstructed the SQL Server equivalent schema from MySQL and copied the data across. However, there were a few catches when migrating to SQL Azure:
    1. Tables without primary keys:
    For some reason, some tables in the web site did not have primary keys. The problem is that a table in SQL Azure needs a clustered index, and no primary key means no clustered index (in the SSMA sense). We used a development SQL Server to re-create the schema first, then looked for the tables without primary keys and added the primary keys back.
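    Adding a key back is one statement per table; for a hypothetical articles table with an id column:
    ALTER TABLE articles
        ADD CONSTRAINT PK_articles PRIMARY KEY CLUSTERED (id);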

    2. Large records:
    The CMS designer had the idea that all data should be put into the database as part of the backup strategy, so everything, including image files, Flash files, and video files, was stored in attachment tables as well. SSMA could handle most of these records, but not all: when a record is large (say 10MB), the copy would eventually fail and interrupt the whole migration process. It is a very traditional timeout problem, and we were dealing with two at once:
    1. Command timeout in SQL Server
    2. Idle timeout in MySQL
    Since we had no control over the timeout configuration and error-handling strategies, we skipped those tables during the migration and dumped the attachments out to the file system. After some tuning of the upload and download scripts to keep the CMS working, we created a PHP script to migrate only the metadata columns of the attachment tables.

    To conclude, the migration was performed in three steps:
    1. Execute the SQL script to create the tables
    2. Use SSMA to copy most of the data
    3. Execute the PHP script to migrate the attachment metadata
    The performance was quite satisfactory.

    Monday, November 15, 2010

    Jeans - Running Java Servlet and JSP Webapp in ASP.NET/IIS - Part 4

    Retrieving a database connection is one of the most common tasks in software development. Usually, we specify the connection string in a config file and create the database connection from it. The Java way is to wrap the connection string in a javax.sql.DataSource and configure it in the application server; the application server then puts the DataSource instance into a JNDI context. So, this is what should be supported.

    Bring in the Unity
    Microsoft has a dependency injection container called Unity. Even though I am not doing dependency injection here, I'd like to configure in web.config how the data source is instantiated. Instead of writing my own configuration classes, Unity is a good alternative as an object builder.

    Before doing any configuration, the JDBC driver has to be compiled into a .Net assembly, since it will be loaded by Unity instead of the Java class loader.
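    With IKVM's static compiler, that is essentially a single command (exact options may vary across IKVM versions):
    ikvmc -target:library sqljdbc4.jar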

    Method Injection
    When configuring Unity, we can specify the methods to be called during instance initialization. The first attempt:
    <method name="setURL">
        <param name="url" value="..." />
    </method>

    However, Unity was smart enough to know that there is no such parameter name. I fired up the Object Browser to look up the actual parameter names. It turned out that IKVM compiles all parameter names as str, str1, str2, etc. The configuration has to be like this:

    <method name="setURL">
        <param name="str" value="..." />
    </method>


    JNDI Context
    Knowing that a data source named jdbc/testDS is the same as java:comp/env/jdbc/testDS, I tried to mimic the behavior with the help of this article. The result turned out to be an exception:
    javax.persistence.PersistenceException: Exception [EclipseLink-7060] (Eclipse Persistence Services - 2.0.2.v20100323-r6872): org.eclipse.persistence.exceptions.ValidationException
    Exception Description: Cannot acquire data source [jdbc/testDS].
    Internal Exception: javax.naming.NameNotFoundException: Name jdbc is not bound in this Context
        at org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl.deploy(EntityManagerSetupImpl.java:399)

    So, I reverted to the simplest approach: just creating the jdbc subcontext in the initial context.
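    In IKVM-compiled terms, that is just a couple of JNDI calls (a sketch; dataSource stands for the instance built by Unity):
    var ctx = new javax.naming.InitialContext();
    ctx.createSubcontext("jdbc");          // create the missing subcontext
    ctx.bind("jdbc/testDS", dataSource);   // bind the DataSource under it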

    Password
    The configuration was set. Time to run the test. It did not work:
    javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.0.2.v20100323-r6872): org.eclipse.persistence.exceptions.DatabaseException
    Internal Exception: com.microsoft.sqlserver.jdbc.SQLServerException: Login failed for user 'jsptest'.
    Error Code: 18456
        at org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl.deploy(EntityManagerSetupImpl.java:399)


    Well, the password was copied from persistence.xml and is very simple:
    <method name="setPassword">
        <param name="str" value="passw0rd" />
    </method>

    I am not sure what happened, but when the user name and password were put in the URL, it worked.

    The new source code has been checked in at http://jeans.codeplex.com/. The persistence.xml has been changed to use <non-jta-data-source>jdbc/testDS</non-jta-data-source>.

    Note: sqljdbc4.jar has been removed from the source repository. Please download it from Microsoft and compile it with IKVM to run the JSP demo site.