Tuesday, December 7, 2021

Royalty-Free PDF Conversion and Manipulation Web Service

PDF generation, anyone? There are tons of libraries for doing all sorts of things with PDF. Yet time and again I witness the pain that PDF handling causes organizations, even though one would think it should long have become totally mundane...

Why is this so? You have to pay for a good PDF tool. Sometimes a lot. And the many free alternatives either focus on narrow tasks, do not produce the best quality, are hard to use, or do not run on the organization's platform of choice. The biggest obstacle is of course the licensing. If you cannot get a completely free license, then there is often a lot of friction when trying to get a paid-for variant... Here is one example of how a conversation could evolve:

- "We already have a license for PDF converter X!". 

- "But it does not do what we need/or it is too hard to use!" etc., etc. 

Needless to say, this sort of complexity typically arises in exactly the kind of organizations that do a lot of document management and depend on PDF generation.

Having been through this recently myself, I was inspired to try to help my customer, Transport Canada, and other teams that may find themselves in a similar situation, by sharing a way to reliably generate high-quality PDF documents and perform some basic manipulations on them, absolutely free of charge.

Indeed, several useful PDF utilities have long been available on Linux, and there is the prominent LibreOffice suite, which can generate great-quality PDF documents free of charge. So I thought: why not put these applications inside a Docker container and write a service that accepts HTTP requests and launches them?

Such a service allows tapping into the richness of available open-source tools, many of which have been out there for decades. And if you are a developer, it lets you easily adjust which specific tools to run, or how to scale the service. Pretty flexible.

The service is now being adopted by the Marine department at Transport Canada, and I hope it will evolve and serve them well. And since Transport Canada has a great policy of sharing the source code for some of their applications, I am happy to share a link to its public Github repository, which also has a detailed description of how it works and how to operate it. I will just list its "no frills" but much sought-after basic capabilities here:

  • Conversion of popular office and image formats to PDF (thanks to LibreOffice!)
  • Merging of office documents and images into a single PDF document (thanks to GhostScript!)
  • Populating and "flattening" of fillable PDF forms (thanks to pdftk-java and PDF Toolkit!)
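
To give a feel for how such a service can be consumed, here is a minimal TypeScript sketch of a client call; the host, endpoint name and form field are hypothetical placeholders, so check the repository's documentation for the actual API.

// Hypothetical client call; the host, /convert endpoint and form field are illustrative.
async function convertToPdf(file: File): Promise<Blob> {
  const form = new FormData();
  form.append('file', file, file.name);

  const response = await fetch('https://pdf-service.example.com/convert', {
    method: 'POST',
    body: form
  });
  if (!response.ok) {
    throw new Error(`Conversion failed with status ${response.status}`);
  }
  return response.blob(); // the generated PDF document
}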

Have fun converting to PDF for free!

Friday, May 21, 2021

Migrate Data from a Cosmos DB Azure Table API

If you need to migrate data out of or into Azure Cosmos DB, you can use Microsoft's data migration tool. The tool is versatile, but the documentation doesn't provide all the answers. Specifically, when you need to migrate data from a Cosmos DB instance configured with the Azure Table API to a JSON file, the tool works, but the settings you need to provide are not obvious. Here are the settings that worked for me for its two main tabs, Source Information and Target Information (you will see them when you run the tool, dtui.exe):

Source Information

  1. Prepare data for assembling a connection string:

    1. Grab the value of Azure Table Endpoint from the Overview page, for example: https://name-of-your-cosmos-db-account.table.cosmos.azure.com:443/

    2. Modify this URL, replacing table.cosmos.azure.com with documents.azure.com.

    3. Grab the value of PRIMARY KEY from the Connection String page.

    4. Grab the name of the root node on the Data Explorer page; this will be your database name.

  2. Assemble the connection string as follows (a helper sketch follows this list):

AccountEndpoint=https://name-of-your-cosmos-db-account.documents.azure.com:443/;AccountKey=primary-key-goes-here;Database=Your-DB-Name

  3. Expand the root node (Your-DB-Name) and take note of the table you want to export, for example My-Table-Name.

  4. Fill in the form on the Source Information tab:

Import from: Azure Cosmos DB

Connection String: use the connection string created in step 2. Click the Verify button; it should work.

Collection: My-Table-Name

Other fields: leave the defaults, or optionally specify a query to limit the export.

  5. Click the Next button to configure Target Information.
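
As referenced in step 2, here is a small TypeScript helper sketch that assembles the connection string from the values gathered in step 1; the function and its inputs are purely illustrative.

// Illustrative helper: build the migration tool connection string
// from the values collected in step 1 above.
function buildConnectionString(tableEndpoint: string, primaryKey: string, dbName: string): string {
  // Step 1.2: swap the Table API host for the documents host
  const accountEndpoint = tableEndpoint.replace('table.cosmos.azure.com', 'documents.azure.com');
  return `AccountEndpoint=${accountEndpoint};AccountKey=${primaryKey};Database=${dbName}`;
}

// buildConnectionString(
//   'https://name-of-your-cosmos-db-account.table.cosmos.azure.com:443/',
//   'primary-key-goes-here',
//   'Your-DB-Name');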

Target Information

  1. Export To: JSON file

  2. Choose Local File radio button option, specify path, optionally select Prettify JSON.

  3. Click Next to complete the wizard and run through the export.

The JSON file should be saved in the directory you specified, or, if you didn't specify one, in the folder you started the data migration tool from.

UPDATE: importing from JSON into Azure Table storage in Cosmos DB also works. The same connection-string manipulations described above for the export scenario are needed. In addition, depending on your situation, you may want to fill out extra parameters describing whether to regenerate Ids, etc. The tutorial on using the data migration tool covers these well.

Monday, January 4, 2021

Command-Line Utility to Validate LUDOWN Files

Microsoft released Bot Framework Composer in May 2020, and the tool has been under active development since. It allows you to rapidly create rich conversational bots that leverage adaptive dialogs, language generation, skills and more.

In working with the Composer, I found that one of the practical challenges was "teaching" LUIS to recognize intents and entities: the labelling was quite verbose, the number of training examples ran into the dozens per intent (at least), and on top of that the documentation of the new LUDOWN format could be improved, even though the format is by far more convenient than JSON.

I've done some digging, thanks to Composer being released as open source, and found the library that the Composer uses for parsing and validating the .lu files: @microsoft/bf-lu.
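
For illustration, here is roughly what validation with that library looks like; this TypeScript sketch is based on my reading of the package README (the LuisBuilder.fromLUAsync call), so double-check the exact API surface against the current docs.

// Sketch: validating a .lu file with @microsoft/bf-lu (V2 API).
// Treat the exact API details here as assumptions from the package README.
const { LuisBuilder } = require('@microsoft/bf-lu').V2;
const fs = require('fs').promises;

async function validateLu(path: string): Promise<void> {
  const luContent = await fs.readFile(path, 'utf8');
  try {
    // fromLUAsync parses the content and throws on parse/validation errors
    const luisObject = await LuisBuilder.fromLUAsync(luContent);
    console.log(`${path}: OK, ${luisObject.intents.length} intent(s) found`);
  } catch (err) {
    console.error(`${path}: validation failed:`, err.text || err);
    process.exitCode = 1;
  }
}

validateLu(process.argv[2]);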

I thought it would be easier to validate the .lu files in CLI mode, so I ended up writing my own CLI to help with quick and verbose validation. As a "bonus", it can also create a temporary LUIS app from an .lu file, since the @microsoft/bf-lu validation may miss certain errors that LUIS would complain about when attempting to create an app.

Here is the utility: https://www.npmjs.com/package/@softforte/lu  

While the new Bot Framework CLI offers similar, comprehensive features, I still find this utility handy for day-to-day LUIS development and hope that it will save you some troubleshooting time. Check it out!


Sunday, November 24, 2019

Single Sign-On for Two Angular Apps with Local Accounts in Azure B2C Tenant

In this day and age Single Sign-On (SSO) is thought of as a commodity: a "flag" an admin turns on somewhere that makes logging into multiple related applications automatic for the end user. Indeed, mainstream identity providers have supported SSO across many protocols for several years now.

That's the mindset I had when approaching the SSO configuration in an Azure B2C tenant. It ended up being a much more cumbersome task than I expected, hence this post. While in a way it is a regurgitation of information already available on the subject on the Internet, I hope that the description of my "SSO journey" below will help reduce the research and experimentation time that an SSO setup in Azure B2C may otherwise require.

Applications and SSO objective

I have two Angular 8 SPA applications hosted independently on two different domains, app1.mydomain.com and app2.mydomain.com. I needed SSO between them, so that when a user signs into one and then browses to the other, either in the same browser tab or in a new tab, the user is not prompted for credentials.
Both applications are registered in the same Azure B2C tenant and use the same policy. Importantly, they only use local accounts for authentication; this was my constraint. I use the MSAL library for authentication/authorization. The applications redirect users to the B2C policy's sign-in page.

What I wish had worked, but didn't...

So I started with the built-in Sign up and Sign in user flow (I also tried the Sign up and Sign in v2 flow, with the same results). If you go to the properties of your flow in the B2C web UI, there is a Single sign-on configuration setting under Session behavior. I set it to Policy, as I had two applications sharing the same policy, then saved the user flow.


When there was still no single sign-on, I realized I was in for a longer ride here.

What worked, but was the wrong path

MSAL documentation describes the library's support for SSO. There are two ways to indicate the SSO intention to the MSAL library: a login hint or a session identifier (SID). Obviously the MSAL library supports this because the underlying identity provider (IdP) does; otherwise it would be pointless.
So the idea is to log in to the first application with the user's credentials, then pass the SID or login hint to the second application, and B2C should authenticate the user to the second application without displaying prompts.

Cannot obtain SID from Azure B2C

I tried hard, but could not find a way to get a SID value from the Azure B2C IdP. I would expect it to be a claim emitted by the IdP in response to a successful sign-on, which appears to be the case for the Azure AD IdP, but I had no such luck with the Azure B2C IdP.

Extra call to obtain login hint value

The other option, the login hint, I could work with. Just get the login claim from the identity or access JWT token returned by B2C and use it as a hint, right? Well, to my surprise, the login claim was not present in JWT tokens returned by the B2C IdP configured with a built-in Sign up or sign in policy.
That's OK, we can make an MS Graph profile API call and get our login that way, paying a few hundred milliseconds of page load time for it. Hmmm...

MSAL Hurdles

It is logical to start with MSAL-Angular if you are in an Angular application... Unfortunately, the library lags behind the MSAL core, and when it comes to SSO, specifically passing on the login hint, it just does not work.
While MSAL-Angular appends the login hint as a login_hint extra query parameter to the IdP call, the core library expects the hint as a property of the AuthenticationParameters object. This results in the ServerRequestParameters.isSSOParam() call returning false, and consequently in the core MSAL library not understanding the login hints and not attempting to establish SSO.
I had to give up on MSAL-Angular and interact directly with the MSAL core library. This got it to work but, as we will see later on, MSAL-Angular "will be back" on the scene.
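
For reference, here is a minimal TypeScript sketch of the direct call against the msal (core) 1.x library; the clientId, authority and username values are placeholders.

// Sketch: passing the login hint directly to msal (core) 1.x.
// The clientId, authority and username values below are placeholders.
import * as Msal from 'msal';

const msalInstance = new Msal.UserAgentApplication({
  auth: {
    clientId: '22222222-2222-2222-2222-222222222222',
    authority: 'https://yourtenant.b2clogin.com/yourtenant.onmicrosoft.com/B2C_1_signupsignin',
    validateAuthority: false // common setting for b2clogin.com authorities
  }
});

const usernameFromFirstApp = 'user@example.com'; // e.g. obtained via the Graph call

// loginHint is a property of AuthenticationParameters, which is what
// ServerRequestParameters.isSSOParam() looks for
msalInstance.loginRedirect({
  scopes: ['openid', 'profile'],
  loginHint: usernameFromFirstApp
});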

Sharing the Login Hint between Apps

OK, if I hardcode the login name as a login hint for the second application, then it works: I get the single sign-on as advertised (or almost!). Now the challenge is to grab the username, obtained through the Graph API call upon successful logon to the first application, and share it with the second application before the user authenticates to it.
Since the apps are on separate domains, they do not see each other's state, even in localStorage. Probably the simplest way around this is to use the messaging API to communicate between the current window of the first app and a hidden iframe pointing to the second app, making the latter set the username in its localStorage in response to a received message, for later use as a login hint. Here is an example of this technique.
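
A rough TypeScript sketch of the idea follows; the origins, listener path and message shape are my illustrative assumptions.

// In app1, after a successful sign-in: push the username to app2 through a hidden iframe.
const signedInUsername = 'user@example.com'; // e.g. from the Graph profile call

const iframe = document.createElement('iframe');
iframe.style.display = 'none';
iframe.src = 'https://app2.mydomain.com/sso-listener.html';
iframe.onload = () => {
  iframe.contentWindow!.postMessage(
    { type: 'sso-login-hint', username: signedInUsername },
    'https://app2.mydomain.com'); // always pin the target origin
};
document.body.appendChild(iframe);

// In app2, on the page loaded inside the iframe: verify the sender and
// store the hint for later use as a login hint.
window.addEventListener('message', (event: MessageEvent) => {
  if (event.origin !== 'https://app1.mydomain.com') { return; }
  if (event.data && event.data.type === 'sso-login-hint') {
    localStorage.setItem('loginHint', event.data.username);
  }
});
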
At this point, the whole process was feeling too fragile and complex for what it does: too many obstacles, as if Microsoft was implicitly warning me against this path, "hinting" that there was a better way.

PII and Sign Out Concerns

And even if I persevered and got over the MSAL-Angular incompatibility and the login-hint sharing complexity, and accepted the extra time that the profile Graph call takes, I would still face the following issue: the login hint that I am sharing between the applications is classified as Personally Identifiable Information (PII). This immediately becomes a concern from a compliance perspective.
Last but not least, there is sign-out complexity here: since in the above approach I store the login hint in localStorage, I need to make sure to clear it when a user signs out or closes her browser tabs.
Under the pressure of the above considerations, which would have turned a seemingly simple identity solution into a needlessly complex subsystem with potential vulnerabilities, I had to look for an alternative.

Custom Identity Experience Framework Policies to the Rescue

Once I understood that I had exhausted the options available in the built-in policies (or user flows, as they are also referred to), I had to turn to custom Identity Experience Framework (IEF) policies.
First things first: to take advantage of custom policies, one needs to follow this Azure B2C preparation guidance word for word to get the environment ready for creating custom policies.
Next, make sure to configure Azure Application Insights for monitoring B2C custom policies, as otherwise it will be quite hard to troubleshoot them.

Get signInName Claim in Access Token

I was looking for a way to avoid the MS Graph call. I came across this great Stack Overflow thread, which shows how to emit the signInName claim as part of access and ID tokens for local Azure B2C accounts.
The detailed instructions in the thread allow adding a signInName claim to the tokens, which is quite helpful. And if you, like me, happen to hit the following error in the process of getting it to work:

 Orchestration step '1' of in policy 'B2C_1A_signup_signin of tenant 'xxxxxxxxxx.onmicrosoft.com' specifies more than one enabled validation claims exchange

then the following thread contains the remedy.

Single Sign-On "Just Works"

Yes, it just works, as a much-welcomed side effect. It was not obvious to me, since the thread was solving a different issue, namely the lack of a username in the claims. I did have to modify one line in the SelfAsserted-LocalAccountSignin-Username Technical Profile in TrustFrameworkExtensions.xml.


This is all that I had to do. Now:
  • There is no need to share login hints and deal with associated compliance risks
  • There is no need to make MS Graph API calls and deal with latency
  • MSAL-Angular library "is back in the picture" and can be used again.
Life is good!

Tuesday, November 12, 2019

MSAL acquireTokenSilent() and Azure B2C Permission Scopes

One thing that was not obvious to me when securing an Angular app with Azure B2C tenant had to do with using permission scopes.

Let's say that you have authenticated through loginRedirect() but need to make a call to the acquireTokenSilent() MSAL API from within your SPA app. Perhaps you are writing your own route guard or something... You need to pass an array of scopes to the method call. There are two ways to get this to work:

1. When you register your app in Azure B2C, it creates a scope for it named user_impersonation. You can take its value (https://yourdomain.onmicrosoft.com/your-app-name/user_impersonation) and pass it to the acquireTokenSilent() method as a single-item array. Or you can create your own scope instead...

You may get an error back from B2C when you call acquireTokenSilent() with this scope: AADB2C90205: This application does not have sufficient permissions against this web resource to perform the operation. To fix it, you need to grant admin consent to the scope through the B2C tenant.

2. There is another way. Check out how the MsalGuard class is implemented. It calls acquireTokenSilent() with a single-item array consisting of the app's clientId, which we got through the app registration. That works without any additional consents.
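
For illustration, here is how the two options might look with the msal-angular 0.x MsalService, whose acquireTokenSilent() takes a plain array of scopes; the identifiers below are placeholders.

// Sketch of the two scope options with msal-angular 0.x; runs inside a
// component/service where MsalService is injected. Identifiers are placeholders.

// Option 1: the app's own user_impersonation scope; needs admin consent in the tenant.
const tokenViaScope = await this.msalService.acquireTokenSilent(
  ['https://yourdomain.onmicrosoft.com/your-app-name/user_impersonation']);

// Option 2: the app's clientId, which is what MsalGuard itself does;
// no additional consent required.
const tokenViaClientId = await this.msalService.acquireTokenSilent(
  ['22222222-2222-2222-2222-222222222222']);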

So both ways work, but there are important differences between them:

In the former case, we are making a call to the https://yourdomain.b2clogin.com/yourdomain.onmicrosoft.com/yourpolicy/oauth2/v2.0/authorize endpoint and passing three space-separated values in the scope query string argument: https://yourdomain.onmicrosoft.com/your-app-name/user_impersonation openid profile.

In the latter case, the call to the endpoint is not made at all (in my case, at least). MSAL "knows" it is authorized, as it has the access token from the preceding call to loginRedirect(). Actually, let's take a look at what Fiddler shows when we call loginRedirect(); specifically, I am interested in which scopes it passes on:

  • In the former case it is https://yourdomain.onmicrosoft.com/your-app-name/user_impersonation openid profile
  • In the latter case, it is only openid profile

Here is a good description of the meaning of these scopes.

With that, here is my takeaway: MSAL converts the clientId scope we pass in calls to loginRedirect(), acquireTokenSilent(), etc. into the openid and profile scopes known to the Microsoft Identity Platform. It is then also smart enough to resolve calls for an access token locally as long as the token is valid.

We can also present our SPA app as an API to the identity platform, create a permission for it, consent to it, and then acquire a token for accessing it. But in a basic authentication scenario such as "is the user logged in or not?", there is no benefit in doing so. It may be useful if we have complex permissions in our application and want to dynamically request different permission scopes defined for various parts of the application.

Wednesday, June 26, 2019

Extract and Inspect All SharePoint Solutions with PowerShell

Migration or upgrades of SharePoint content databases commonly involve provisioning of WSP solutions. At times you may find yourself needing to search for a particular feature GUID, which is buried somewhere inside one of the dozens of solution files that you have extracted from the farm in question.

If you are on Windows Server 2012 or higher, you can leverage the expand.exe command to extract CAB files (WSP files are CAB files). Here is a one-liner PowerShell command to extract the contents of your WSP solutions to respective folders:

dir *.wsp | % { New-Item -Type directory -Path ".\$($_.Name.Remove($_.Name.Length - 4))"; expand.exe $_.Name -F:* $_.Name.Remove($_.Name.Length - 4)}

How to use: first place your solutions in a folder, cd into it, then run the above command, which will create a folder per extracted solution and dump its contents there.

Now you can quickly tell whether the feature Id you are after is among the ones extracted. For example, the following one-liner will list all feature Ids and Titles, as well as paths to the Feature.xml files, in table format:

dir Feature.xml -Recurse | % { $path=[system.io.path]::combine($_.Directory, $_.Name); [xml]$doc = Get-Content -Path $path; $obj = New-Object PSObject -Property @{Path=$path; Id=$doc.Feature.Id; Title=$doc.Feature.Title;}; $obj} | select Id, Title, Path

Oh, and almost forgot, this may also be handy: you can use this line to dump all farm solution files to your current directory, once you make sure you are running it inside an elevated SharePoint PowerShell session:

(Get-SPFarm).Solutions | ForEach-Object{$var = (Get-Location).Path + "\" + $_.Name; $_.SolutionFile.SaveAs($var)}

Happy migrating!

Sunday, June 16, 2019

Azure AD Authentication and Graph API Access in Angular and ASP.NET Core

Wow, it's been quiet here... Enough with the intro ;) and on to the subject, which I find interesting and worthy of writing about...

Consider this scenario, which I think makes a lot of practical sense: a web single-page application (SPA) authenticates users against Azure AD using the OpenID Connect implicit grant flow. Then some of the SPA's client-side components make queries to the Graph API, while others hit its own server-side Web API.

What follows are highlights from my experience implementing this scenario.

Client-side components obtain access tokens from Azure AD and pass them along with calls to the MS Graph API or to the ASP.NET Web API. The former case is standard and well explained, while the latter is less so, and therefore more interesting. ASP.NET is configured to use bearer token authentication and creates a user identity, which the rest of the server-side logic can then use for its reasoning.

When validating tokens coming down from the client components of the application, I used code similar to that shown below, inside the ConfigureServices method:

// Example of using Azure AD OpenID Connect bearer authentication.
// Requires: using Microsoft.AspNetCore.Authentication.JwtBearer;
//           using Microsoft.IdentityModel.Tokens;
services.AddAuthentication(sharedOptions =>
{
    sharedOptions.DefaultScheme = JwtBearerDefaults.AuthenticationScheme;
}).AddJwtBearer(options =>
{
    options.Authority = "https://login.microsoftonline.com/11111111-1111-1111-1111-111111111111";
    options.TokenValidationParameters = new TokenValidationParameters
    {
        ValidIssuer = "https://login.microsoftonline.com/11111111-1111-1111-1111-111111111111/v2.0",
        ValidAudiences = new[] { "22222222-2222-2222-2222-222222222222" }
    };
});

where 11111111-1111-1111-1111-111111111111 is the tenant Id, and 22222222-2222-2222-2222-222222222222 is the client Id of the application registration.

One of the motivations for this post was an issue I kept hitting with this authentication logic. I originally also had an extra property set on the TokenValidationParameters object:

IssuerSigningKey = new X509SecurityKey(cert)

The above line assigns a public key, encoded as an X.509 certificate chain, to be used later to validate the signature applied to the token by Azure AD. Check out the always-excellent insights from Andrew Connell, where he explains the need for key-based signature checks when validating tokens, and a mechanism for obtaining the public key (the "cert" in the line of code above).

My logic was, however, failing with the error IDX10511 "Signature validation failed. Keys tried...", and my research into the nuances of the RSA implementation in ASP.NET and JSON Web Token encoding was fruitless, until I found this thread on GitHub, and this related thread.

It turned out that my signature validation was fine, and the line above was not needed: the library I rely on for token validation, Microsoft.IdentityModel.Tokens, takes care of it automatically by making a call to obtain the Azure JSON Web Key Set and deserializing the response into .NET public keys used for signature checking.

The actual problem had to do with my usage of access tokens: an access token obtained for the Microsoft Graph API resource fails signature validation when used against a different resource (my custom ASP.NET Web API in this case). This fact, and the room for improvement in the ASP.NET error message, are covered in detail in the GitHub threads above.

What I had originally, which I refer to as the "naive" configuration, is shown in the figure below.


In this figure, I have an Azure app registration for my web application, requesting some Graph permission scopes. During execution I acquire a token on the client (1) and use it when sending requests to the Graph API (2), but fail to do the same against my ASP.NET Web API (3), which results in the IDX10511 error.

What is interesting here is that:
  1. This setup kind of makes sense: I have an app, it is registered, and it wants to use the access token it gets from Azure to let its own API "know" that a user has logged in.
  2. The problem can be fixed by sending an ID token instead of the access token in step (3). The OpenID Connect protocol grants an ID token upon login, which signifies an authentication event, while an access token signifies an authorization event. The ID token's signature is validated without errors, and ASP.NET creates a claims identity for the signed-in user.
What is not good about this design is that the ID token is not meant to be used this way. While one can choose to deviate from the protocol's concept, it is not wise to do so without a compelling reason, since tooling and third-party libraries won't follow suit.

Specifically, here are the problems I could identify with the above design:
  1. OpenID Connect, and OAuth 2.0 by extension, use different grant flows depending on the type of client. For a web browser it is the Implicit Grant, while for a server-side client it is one of the other flows, depending on the scenario. We are in essence trying to use a token issued for one audience when calling another audience. In my example the Angular SPA and the Web API are on the same domain; if they were hosted on different domains, this issue would have been more obvious.
  2. Microsoft offers an OAuth 2.0 extension, the on-behalf-of flow (aka OBO flow), which becomes useful if we decide to enhance our ASP.NET Web API by having it also access the Graph API or another Microsoft cloud API. The current setup is not going to work with the OBO flow.

The figure below shows an improved design:


This time we treat the server-side Web API as a separate application as far as Azure AD is concerned. We do have to make our SPA acquire an access token twice, as shown in calls (1) and (3), once for each audience: first for Graph, and then for our own API. With that, both the call to Graph (2) and the call to our own API (4) succeed.

Also, this design fits well with the OAuth paradigm. In fact, by the time we decide to augment our Web API and start making on-behalf-of calls from within it, we will have already implemented its "first leg".

Lastly, a couple of notes about the MSAL Angular configuration. Here is mine:


    MsalModule.forRoot({
      clientID: environment.azureRegistration.clientId,
      authority: environment.azureRegistration.authority,
      validateAuthority: true,
      redirectUri: environment.azureRegistration.redirectUrl,
      cacheLocation: 'localStorage',
      postLogoutRedirectUri: environment.azureRegistration.postLogoutRedirectUrl,
      navigateToLoginRequestUrl: true,
      popUp: false,
//      consentScopes: GRAPH_SCOPES,
      unprotectedResources: ['https://www.microsoft.com/en-us/'],
      protectedResourceMap: PROTECTED_RESOURCE_MAP,
      logger: loggerCallback,
      correlationId: '1234',
      level: LogLevel.Verbose,
      piiLoggingEnabled: true
    }),

MSAL will automatically acquire an access token right after an ID token is acquired via a call to MsalService.loginPopup() with no scopes passed in as arguments. Commenting out or removing the consentScopes config option results in MSAL defaulting to using the app's client Id as the audience and returning a somewhat useless access token with no scopes in it.

I did this because I wanted to explicitly request separate access tokens for Graph and for my Web API. The way to do it is to pass the scopes corresponding to an application in a call to MsalService.acquireTokenSilent(scopes). I am now thinking of changing it to pass the scopes of the Graph API initially, so that my first access token is useful. For the second one I have no choice but to call MsalService.acquireTokenSilent(myWebApi_AppScopes) again.
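
To make that concrete, here is a sketch of requesting one token per audience with the msal-angular 0.x API; the scope values are placeholders standing in for the Graph and Web API permissions discussed above.

// One silent request per audience; MSAL caches and renews each token separately.
// Runs inside a class with MsalService injected; scope values are placeholders.
const graphScopes = ['https://graph.microsoft.com/User.Read'];
const myWebApiAppScopes = ['https://yourdomain.onmicrosoft.com/your-api/user_impersonation'];

const graphToken = await this.msalService.acquireTokenSilent(graphScopes);
const apiToken = await this.msalService.acquireTokenSilent(myWebApiAppScopes);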