NOTE: This original May 4th article was updated June 25.
MSIX modern application packages use a series of identifiers that can be very confusing. Packages have Names, Versions, Families, and a PackageFamilyName that includes that odd set of characters you sometimes see, which is somewhat tied to the code-signing certificate used to sign the package.
Until recently, I had been able to pretty much ignore those odd characters, but while building new tooling to support the upcoming SharedPackageContainer feature I was suddenly tasked with figuring this thing out so that I could generate them. Here’s what I learned.
The Package “Name” is stored in the Identity element of the AppXManifest file. The Package Name is a string of 3-50 characters and has limitations on the allowed character set.
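A quick pre-flight check of a candidate name can be sketched as below. The 3-50 length comes from the text above; the allowed character set (letters, digits, period, and dash) is my assumption based on my reading of Microsoft's Identity element documentation, and the function name is made up for illustration.

```python
import re

# Assumed rule: 3-50 characters drawn from letters, digits, period, and dash.
# The length limits are from the article; the character set is an assumption.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9.-]{3,50}")


def is_plausible_package_name(name: str) -> bool:
    """Return True if the name fits the assumed Package Name rules."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("Contoso.SampleApp", "ab", "Bad Name"):
        print(candidate, is_plausible_package_name(candidate))
```

This only tells you a name looks reasonable; the packaging tools remain the real authority on what is accepted.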
The intent is that this name represents the idea of an application that may have different implementations, but only one of these implementations may be present on a system. For example, I can have two packages use the same Name but have different Versions. Or I can have packages with the same name for different Architectures (x86, x64, arm, arm64). Each of these packages would have its own MSIX file; however, it would be possible to combine the files of different architectures into a larger Bundle file that could be installed, and the OS would decide the appropriate MSIX file to use.
As I’ll explain later, the intent may have been that only one package with the same name is installed on a given system, but there are exceptions.
There is also a package “DisplayName” field in the Manifest. This is not associated with identity but used instead in some tooling when displaying the package Name to a user in a more friendly form.
The Version field is a “quad-tuple”, a string made up of four unsigned integers separated by the period character, as in “22.214.171.124”. The fourth field is reserved for use by Microsoft when the package is distributed through the Microsoft Store, but for non-store distribution this field may be used as desired. There is the concept of newer (higher) and older (lower) versions and much of the tooling uses this to decide whether/when to upgrade.
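To make the newer/older comparison concrete, here is a minimal sketch of parsing the four fields and comparing them numerically. The function names are hypothetical, and the 0-65535 range per field is my assumption (each field being a 16-bit unsigned integer); it is not stated above.

```python
def parse_version(text: str) -> tuple:
    """Split a 'a.b.c.d' version string into a tuple of four integers."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four fields, got {len(parts)}: {text!r}")
    numbers = []
    for part in parts:
        # Only plain ASCII digits are accepted: no signs, spaces, or empty fields.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"field is not an unsigned integer: {part!r}")
        value = int(part)
        if value > 65535:  # assumed 16-bit limit per field
            raise ValueError(f"field out of assumed range: {value}")
        numbers.append(value)
    return tuple(numbers)


def is_newer(candidate: str, installed: str) -> bool:
    """True when candidate is a higher version than installed."""
    # Tuples compare field by field, so 1.10.0.0 is correctly above 1.9.0.0.
    return parse_version(candidate) > parse_version(installed)


if __name__ == "__main__":
    print(is_newer("1.10.0.0", "1.9.0.0"))
```

Comparing the raw strings instead would get cases like 1.10.0.0 versus 1.9.0.0 wrong, which is why the fields are converted to integers first.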
The “Publisher” field is a string that usually starts with “CN=…”. This field is also stored in the AppXManifest Identity element. The intent of this element is that it provides a linkage between the contents of the package and the identity of the entity that signs the package. Linkage here does not necessarily imply that this field identifies the publisher; however, the value of this string must match the Subject name of the certificate used to perform code-signing of the package.
Anyone can create a code-signing certificate with whatever Subject field they want, use that in the Publisher field of the AppXManifest, and successfully create a signed package. I could even use the same string that Microsoft uses. That package won’t install unless I can trick you into installing my private certificate, but the point here is that the Publisher field is not really an identity of anything, just a linkage to the certificate.
Publisher Display Name
Like the package Display Name field, this is used simply for display purposes in some tooling to provide a friendlier name. You’d rather see “TMurgent Technologies, LLP” show up rather than “CN=TMurgent Technologies, LLP…”.
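To show where the fields described so far live, here is a small sketch that pulls them out of a manifest with Python's standard XML parser. The sample manifest is invented for illustration (Contoso is a placeholder), and it is trimmed to the elements discussed here; a real AppXManifest contains much more.

```python
import xml.etree.ElementTree as ET

# Default namespace used by Windows 10 era AppXManifest files.
NS = {"m": "http://schemas.microsoft.com/appx/manifest/foundation/windows10"}

# Invented, heavily trimmed sample manifest.
SAMPLE_MANIFEST = """<Package xmlns="http://schemas.microsoft.com/appx/manifest/foundation/windows10">
  <Identity Name="Contoso.SampleApp"
            Publisher="CN=Contoso Software, O=Contoso, C=US"
            Version="1.0.0.0"
            ProcessorArchitecture="x64" />
  <Properties>
    <DisplayName>Contoso Sample App</DisplayName>
    <PublisherDisplayName>Contoso Software</PublisherDisplayName>
  </Properties>
</Package>"""


def read_identity(manifest_xml: str) -> dict:
    """Return the identity and display fields found in a manifest string."""
    root = ET.fromstring(manifest_xml)
    identity = root.find("m:Identity", NS)
    if identity is None:
        raise ValueError("manifest has no Identity element")
    return {
        "Name": identity.get("Name"),
        "Publisher": identity.get("Publisher"),
        "Version": identity.get("Version"),
        "ProcessorArchitecture": identity.get("ProcessorArchitecture"),
        "DisplayName": root.findtext("m:Properties/m:DisplayName", namespaces=NS),
        "PublisherDisplayName": root.findtext(
            "m:Properties/m:PublisherDisplayName", namespaces=NS
        ),
    }


if __name__ == "__main__":
    for key, value in read_identity(SAMPLE_MANIFEST).items():
        print(f"{key}: {value}")
```

Note how the identity values are attributes of the Identity element, while the two display names are separate elements under Properties.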
Package Family Name
The “Package Family Name” is a field that consists of the package Name field, followed by an underscore character, followed by a set of random-looking characters. This field is used as the folder name for where the package is installed to, and it also appears in Microsoft tooling sometimes. This field is used generally for display and for a more unique identity for the installed package.
I said earlier that the intent appears to have been that only one package with a given Package Name may be installed, but the actual system limitation is that the Package Family Name is used for this uniqueness requirement.
So what is this semi-random-looking set of characters and what do we call it?
As far as I can find, there is no official name for it, but the string is the same for any package signed by the same certificate. Microsoft describes it as a hash of the Publisher field, an overly simplified and possibly misleading description.
To create the XML file for the new SharedPackageContainer feature, I need to be able to generate the Package Family Name for packages that are not installed anywhere. The string does not exist anywhere inside the package itself, but is dynamically generated when a package is installed. I don’t want to install just to find out what the string is, so I looked into the open source code Microsoft has for PackageManager to see how they generate it. It uses a hash, but is not a hash. It is, eh, an obscurity. And probably not a good one. I’m actually guessing it may be an adaptation of earlier things that were hashes and helped with uniqueness, but this isn’t.
Here is an English summary of that code. It takes the Publisher field from the AppXManifest and transforms it into a UTF-16 string (also known as a Unicode string, or that strange multi-byte format of a string where you see each ASCII character followed by a zero byte). It then creates a SHA-256 hash of that string.
The SHA-256 hash is a 256-bit (32-byte) hash of the UTF-16 string. Like any hash, it doesn’t create a 100% unique value for every possible input string, but it is close enough that we aren’t going to worry about it. At least not yet. After all, SHA2 was good enough for a long time.
But then the code takes only 64 bits (8 bytes) of that hash and performs an Encode32 operation on it, turning it into a 13-character value. This is a simple reformatting of the bits and is completely reversible; we usually think of such encodings as a way to allow binary data to be treated as characters. Those are the 13 characters that we see.
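The steps above can be sketched in a few lines of Python. Treat this as my reconstruction under stated assumptions rather than Microsoft's implementation: I assume the UTF-16 encoding is little-endian with no byte order mark, that the first 8 bytes of the hash are the ones used, that the bits are padded with a single zero bit to make 13 groups of 5, and that the 32-character alphabet is digits plus lowercase letters without i, l, o, and u. The function names are my own.

```python
import hashlib

# Assumed alphabet for the base-32 step: digits, then lowercase letters
# with i, l, o, and u left out.
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"


def publisher_id(publisher: str) -> str:
    """Derive the 13-character string from a manifest Publisher value."""
    # Step 1: Publisher string as UTF-16 (assumed little-endian, no BOM).
    data = publisher.encode("utf-16-le")
    # Step 2: SHA-256 of those bytes (32 bytes).
    digest = hashlib.sha256(data).digest()
    # Step 3: keep only the first 8 bytes (64 bits).
    value = int.from_bytes(digest[:8], "big")
    # Step 4: pad with one zero bit so 65 bits split evenly into 13 groups of 5.
    value <<= 1
    # Step 5: map each 5-bit group, most significant first, to a character.
    return "".join(ALPHABET[(value >> shift) & 0x1F] for shift in range(60, -1, -5))


def package_family_name(name: str, publisher: str) -> str:
    """Package Name, an underscore, then the publisher-derived characters."""
    return f"{name}_{publisher_id(publisher)}"


if __name__ == "__main__":
    # Placeholder values; substitute the Name and Publisher from a real manifest.
    print(package_family_name("Contoso.SampleApp", "CN=Contoso Software, O=Contoso, C=US"))
```

If you try this, check the result against the Package Family Name of a package you already have installed before relying on it, since any of the assumptions above could be off.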
Using only a quarter of the generated hash completely invalidates the uniqueness and probably opens the door for someone to generate a Publisher field resulting in the same 13-character string. This doesn’t compromise security, as you still need the associated code-signing certificate to be installed as a trusted certificate, but it certainly seems like an odd compromise to keep the size of the Package Family Name field down; most likely just to reduce the size of the folder name.
I simply fail to see the wisdom of the package family name. It does not add to the identity of the package, and causes additional issues. For example, if someone wants to provide an upgraded package by using the same package name and higher version string, as long as they use the same certificate the system will recognize the upgrade scenario as the Package Family Names match.
But should I use a different certificate that has an altered Publisher Name it will be treated as a separate package and installed in parallel. This could happen due to company mergers. Or if someone gets the app from the Microsoft Store (where it is signed by Microsoft on my behalf using a different Publisher string) and wants to upgrade using a download directly from my website where I sign it with my certificate purchased through a CA. Or when my code-signing certificate expires later this year (they are not renewable) and I have to buy a new one. In my case, the CA has since been acquired by a different CA and I can’t be sure that I will end up with an identical Subject field in the new cert.
But this is what we have to work with. Hopefully my investigation will help you with whatever you are facing that caused you to find this post!
UPDATE: June 25, 2021
Well, my old cert has now expired and I don’t have a good replacement because the new CA has failed yet again to produce a new cert with a matching Subject field. Without it, I can’t release new MSIX packages and have my customers get that smooth upgrade experience; they’d have to uninstall, install, and lose their settings.
I might be the first to run into this, but I surely won’t be the last!