From c22a7f582b4e0f42fa05313e10d5a54028dc3a44 Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Tue, 16 Jun 2026 07:32:14 -0400 Subject: [PATCH 01/11] OPENNLP-1846 - Recognize all entity types in NameFinderDL and harden decoding NameFinderDL only decoded B-PER/I-PER and put the matched text in Span.getType() instead of the entity label. Decode the BIO sequence generically and harden it: - Any B- begins a span whose type is the label minus the B- prefix (B-ORG -> ORG), extending while the following labels are I-. Span.getType() now reports the entity label (PER, ORG, LOC, ...) and ids2Labels fully drives recognition for any BIO-tagged model. - isBeginLabel() requires a non-empty type after "B-", so a malformed "B-" label no longer starts an empty-type span. An argmax index with no entry in ids2Labels fails loudly instead of being silently skipped. - Span.getProb() is now a numerically stable softmax over the token's label scores (bounded to [0,1]) instead of the raw max logit; handles +Inf, all-(-Inf) and NaN edge cases. - find() inference is fail-loud and consistent with the sibling DocumentCategorizerDL: failures surface as IllegalStateException (cause preserved) and an unexpected/empty model-output shape is its own loud failure, rather than a bare RuntimeException or raw ClassCastException. - Floor the character-search cursor at each sentence's start (via sentPosDetect) and thread it forward across that sentence's chunks, so a repeated entity surface form is located at its own occurrence instead of being re-matched against an earlier one -- which previously emitted duplicate or mis-located spans for multi-sentence/multi-chunk input. - Span text reconstruction matches the source with flexible whitespace (\s*), so entities whose wordpiece tokenization splits internal punctuation or "&" apart (U.S.A, AT&T) are still located instead of silently dropped. - Remove the now-unused SpanEnd record. 
- Extract decodeSpans()/predictLabel()/findEntityEnd()/buildSpanText() and expose labelProbability()/maxIndex() for unit testing without an ONNX model; add NameFinderDLTest coverage for entity types, bounded and edge-case probabilities, malformed begin labels, wordpiece reconstruction, internal-punctuation and case-insensitive matching, missing labels, and cursor-threaded span location. - Reconcile the OPENNLP-1844 concurrency/snapshot eval tests with the new all-types output (the George-Washington input now yields PER + LOC) and assert span types and covered text. --- opennlp-core/opennlp-ml/opennlp-dl/README.md | 16 +- .../src/main/java/opennlp/dl/SpanEnd.java | 27 -- .../opennlp/dl/namefinder/NameFinderDL.java | 402 +++++++++++------- .../dl/namefinder/NameFinderDLTest.java | 209 +++++++++ .../dl/namefinder/NameFinderDLEval.java | 60 ++- 5 files changed, 523 insertions(+), 191 deletions(-) delete mode 100644 opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/SpanEnd.java diff --git a/opennlp-core/opennlp-ml/opennlp-dl/README.md b/opennlp-core/opennlp-ml/opennlp-dl/README.md index 912cd983d..04a7715d4 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/README.md +++ b/opennlp-core/opennlp-ml/opennlp-dl/README.md @@ -8,7 +8,21 @@ Models used in the tests are available in the [opennlp evaluation test data](htt ## NameFinderDL -Export a Huggingface NER model to ONNX, e.g.: +`NameFinderDL` runs ONNX token-classification models that use BIO labels. Any +label in the form `B-<TYPE>` starts an entity and subsequent `I-<TYPE>` labels +continue that entity. The text after the prefix is reported as the OpenNLP span +type, for example `B-PER` and `I-PER` produce spans with type `PER`. + +The finder uses BERT basic tokenization followed by WordPiece tokenization and +then maps the reconstructed WordPiece text back to the caller's original input +so returned spans can be used with `Span#getCoveredText(...)`. 
Span probabilities +are normalized from the model logits and are reported in the range `(0, 1]`. + +Named entity models are commonly cased, so lower casing is disabled by default. +Set `InferenceOptions#setLowerCase(true)` only for models trained with uncased +input. + +Export a Hugging Face NER model to ONNX, e.g.: ```bash python -m transformers.onnx --model=dslim/bert-base-NER --feature token-classification exported diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/SpanEnd.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/SpanEnd.java deleted file mode 100644 index 2c91c1928..000000000 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/SpanEnd.java +++ /dev/null @@ -1,27 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ - -package opennlp.dl; - -public record SpanEnd(int index, int characterEnd) { - - @Override - public String toString() { - return "index: " + index + "; character end: " + characterEnd; - } - -} diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java index 3445969e8..ea5656ce5 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java @@ -20,6 +20,7 @@ import java.io.File; import java.io.IOException; import java.nio.LongBuffer; +import java.util.ArrayList; import java.util.Arrays; import java.util.HashMap; import java.util.LinkedList; @@ -35,7 +36,6 @@ import opennlp.dl.AbstractDL; import opennlp.dl.InferenceOptions; -import opennlp.dl.SpanEnd; import opennlp.dl.Tokens; import opennlp.tools.commons.ThreadSafe; import opennlp.tools.namefind.TokenNameFinder; @@ -67,9 +67,17 @@ @ThreadSafe public class NameFinderDL extends AbstractDL implements TokenNameFinder { + /** Example person labels; retained for reference. Decoding handles any B-/I- type. */ public static final String I_PER = "I-PER"; public static final String B_PER = "B-PER"; public static final String SEPARATOR = "[SEP]"; + private static final String CLS_TOKEN = "[CLS]"; + + /** Prefix used by BIO labels for the first token in an entity span. */ + private static final String BEGIN_PREFIX = "B-"; + + /** Prefix used by BIO labels for continuation tokens in an entity span. */ + private static final String INSIDE_PREFIX = "I-"; /** NER models are commonly cased, so lower casing is off by default. 
*/ private static final boolean LOWER_CASE_DEFAULT = false; @@ -144,246 +152,334 @@ private static InferenceOptions validateConstructorArguments( @Override public Span[] find(String[] input) { - final List<Span> spans = new LinkedList<>(); + final List<Span> spans = new ArrayList<>(); // Join the tokens here because they will be tokenized using Wordpiece during inference. final String text = String.join(" ", input); - final String[] sentences = sentenceDetector.sentDetect(text); + // sentPosDetect (not sentDetect) so each sentence's offset in the full text is known. + final Span[] sentenceSpans = sentenceDetector.sentPosDetect(text); + + for (final Span sentenceSpan : sentenceSpans) { - for (String sentence : sentences) { + // Floor the character cursor at this sentence's start, then thread it forward across the + // sentence's chunks so a repeated surface form is located at its next occurrence. Flooring + // per sentence keeps an entity from being matched against an identical surface form in an + // earlier sentence -- even one that produced no spans, which would otherwise leave the + // cursor behind and mis-locate the match. + int searchStart = sentenceSpan.getStart(); // The WordPiece tokenized text. This changes the spacing in the text. - final List<Tokens> wordpieceTokens = tokenize(sentence); + final List<Tokens> wordpieceTokens = tokenize(sentenceSpan.getCoveredText(text).toString()); for (final Tokens tokens : wordpieceTokens) { + final List<Span> decoded = + decodeSpans(text, tokens.tokens(), infer(tokens), ids2Labels, searchStart); + spans.addAll(decoded); + if (!decoded.isEmpty()) { + searchStart = decoded.get(decoded.size() - 1).getEnd(); + } + } - try { - - // The inputs to the ONNX model. 
- final Map<String, OnnxTensor> inputs = new HashMap<>(); - - final float[][][] v; - try { - inputs.put(INPUT_IDS, OnnxTensor.createTensor(env, LongBuffer.wrap(tokens.ids()), - new long[] {1, tokens.ids().length})); - - if (includeAttentionMask) { - inputs.put(ATTENTION_MASK, OnnxTensor.createTensor(env, - LongBuffer.wrap(tokens.mask()), new long[] {1, tokens.mask().length})); - } - - if (includeTokenTypeIds) { - inputs.put(TOKEN_TYPE_IDS, OnnxTensor.createTensor(env, - LongBuffer.wrap(tokens.types()), new long[] {1, tokens.types().length})); - } - - // The outputs from the model. - try (OrtSession.Result result = session.run(inputs)) { - // getValue() copies the tensor into Java arrays, so the result can be closed safely. - v = (float[][][]) result.get(0).getValue(); - } - } finally { - inputs.values().forEach(OnnxTensor::close); - } - - // Find consecutive B-PER and I-PER labels and combine the spans where necessary. - // There are also B-LOC and I-LOC tags for locations that might be useful at some point. + } - // Keep track of where the last span was so when there are multiple/duplicate - // spans we can get the next one instead of the first one each time. - int characterStart = 0; + return spans.toArray(new Span[0]); - final String[] toks = tokens.tokens(); + } - // We are looping over the vector for each word, - // finding the index of the array that has the maximum value, - // and then finding the token classification that corresponds to that index. - for (int x = 0; x < v[0].length; x++) { + /** + * Runs the model on one token window and returns the per-token label score rows. A failure + * executing the model (an {@link OrtException} or any runtime fault) is surfaced as an + * {@link IllegalStateException} (cause preserved); an unexpected output shape is its own loud + * failure. This mirrors the fail-loud contract of the sibling {@code DocumentCategorizerDL}. + * + * @param tokens The tokens for one chunk to run inference on. 
+ * @return The {@code [token][label]} score matrix for the chunk. + */ + private float[][] infer(final Tokens tokens) { - final float[] arr = v[0][x]; - final int maxIndex = maxIndex(arr); - final String label = ids2Labels.get(maxIndex); + final Map<String, OnnxTensor> inputs = new HashMap<>(); + final Object output; + try { + inputs.put(INPUT_IDS, OnnxTensor.createTensor(env, LongBuffer.wrap(tokens.ids()), + new long[] {1, tokens.ids().length})); - // TODO: Need to make sure this value is between 0 and 1? - // Can we do thresholding without it between 0 and 1? - final double confidence = arr[maxIndex]; // / 10; + if (includeAttentionMask) { + inputs.put(ATTENTION_MASK, OnnxTensor.createTensor(env, + LongBuffer.wrap(tokens.mask()), new long[] {1, tokens.mask().length})); + } - // Is this is the start of a person entity. - if (B_PER.equals(label)) { + if (includeTokenTypeIds) { + inputs.put(TOKEN_TYPE_IDS, OnnxTensor.createTensor(env, + LongBuffer.wrap(tokens.types()), new long[] {1, tokens.types().length})); + } - String spanText; + // getValue() copies the tensor into Java arrays, so the result can be closed safely. + try (OrtSession.Result result = session.run(inputs)) { + output = result.get(0).getValue(); + } + } catch (OrtException | RuntimeException ex) { + throw new IllegalStateException("Unable to perform name finder inference", ex); + } finally { + inputs.values().forEach(OnnxTensor::close); + } - // Find the end index of the span in the array (where the label is not I-PER). - final SpanEnd spanEnd = findSpanEnd(v, x, ids2Labels, toks); + // The model returns one score row per token, batched: float[batch][token][label]. Any other + // shape (or an empty batch) is a model-contract violation, surfaced on its own rather than as + // "inference failed". + if (output instanceof float[][][] v && v.length > 0) { + return v[0]; + } + throw new IllegalStateException("Unexpected model output type: " + + (output == null ? 
"null" : output.getClass().getName())); + } - // If the end is -1 it means this is a single-span token. - // If the end is != -1 it means this is a multi-span token. - if (spanEnd.index() != -1) { + @Override + public void clearAdaptiveData() { + // No use in this implementation. + } - final StringBuilder sb = new StringBuilder(); + /** + * Decodes spans beginning the character search at the start of {@code text}. Equivalent to + * {@link #decodeSpans(String, String[], float[][], Map, int)} with {@code searchStart == 0}. + * + * @param text The original text passed to the model. + * @param tokens The WordPiece tokens produced for the text. + * @param tokenLabelScores The per-token label scores returned by the model. + * @param id2Labels The mapping from model output indexes to BIO labels. + * @return The decoded spans. + */ + static List<Span> decodeSpans(String text, String[] tokens, float[][] tokenLabelScores, + Map<Integer, String> id2Labels) { + return decodeSpans(text, tokens, tokenLabelScores, id2Labels, 0); + } - // We have to concatenate the tokens. - // Add each token in the array and separate them with a space. - // We'll separate each with a single space because later we'll find the original span - // in the text and ignore spacing between individual tokens in findByRegex(). - int end = spanEnd.index(); - for (int i = x; i <= end; i++) { + /** + * Converts model token classifications into character spans in the original input text. + * + *
<p>The ONNX model returns one score vector for each WordPiece token. This method applies + * BIO decoding, reconstructs WordPiece fragments, and then resolves the reconstructed text + * against the original sentence so that {@link Span#getCoveredText(CharSequence)} works with + * the caller's input.</p>
+ * + * @param text The original text passed to the model. + * @param tokens The WordPiece tokens produced for the text. + * @param tokenLabelScores The per-token label scores returned by the model. + * @param id2Labels The mapping from model output indexes to BIO labels. + * @param searchStart The character offset in {@code text} to begin locating spans from. Threading + * a monotonic cursor across the chunks and sentences of a single {@link #find(String[])} call + * keeps a repeated entity surface form from being emitted twice at the same first occurrence. + * @return The decoded spans. + */ + static List<Span> decodeSpans(String text, String[] tokens, float[][] tokenLabelScores, + Map<Integer, String> id2Labels, int searchStart) { - // If the next token starts with ##, combine it with this token. - if (toks[i + 1].startsWith(CHARS_TO_REPLACE)) { + if (tokens.length != tokenLabelScores.length) { + throw new IllegalArgumentException("The number of tokens (" + tokens.length + + ") must match the number of model output rows (" + tokenLabelScores.length + ")."); + } - sb.append(toks[i]).append(toks[i + 1].replace(CHARS_TO_REPLACE, "")); + final List<Span> spans = new ArrayList<>(); - // Append a space unless the next (next) token starts with ##. - if (!toks[i + 2].startsWith(CHARS_TO_REPLACE)) { - sb.append(" "); - } + int characterStart = searchStart; - // Skip the next token since we just included it in this iteration. 
- i++; + for (int x = 0; x < tokenLabelScores.length; x++) { + final LabelPrediction prediction = predictLabel(tokenLabelScores[x], id2Labels); + if (!isBeginLabel(prediction.label())) { + continue; + } - } else { + final String entityType = prediction.label().substring(BEGIN_PREFIX.length()); + final EntityPrediction entity = findEntityEnd(tokenLabelScores, x, id2Labels, + entityType, prediction.probability()); + final String spanText = buildSpanText(tokens, x, entity.endIndex()); - sb.append(toks[i].replace(CHARS_TO_REPLACE, "")); + if (spanText.isBlank()) { + x = entity.endIndex(); + continue; + } - // Append a space unless the next token is a period. - if (!".".equals(toks[i + 1])) { - sb.append(" "); - } + final SpanMatch match = findByRegex(text, spanText, characterStart); + if (match.start() != -1) { + spans.add(new Span(match.start(), match.end(), entityType, entity.probability())); + characterStart = match.end(); + } - } + x = entity.endIndex(); + } - } + return spans; - // This is the text of the span. We use the whole original input text and not one - // of the splits. This gives us accurate character positions. - spanText = findByRegex(text, sb.toString().trim()).trim(); + } - } else { + private static EntityPrediction findEntityEnd(float[][] tokenLabelScores, int startIndex, + Map<Integer, String> id2Labels, + String entityType, + double startProbability) { - // This is a single-token span so there is nothing else to do except grab the token. 
- spanText = toks[x]; + final String insideLabel = INSIDE_PREFIX + entityType; + int endIndex = startIndex; + double probability = startProbability; - } + for (int x = startIndex + 1; x < tokenLabelScores.length; x++) { + final LabelPrediction prediction = predictLabel(tokenLabelScores[x], id2Labels); + if (!insideLabel.equals(prediction.label())) { + break; + } + endIndex = x; + probability = Math.min(probability, prediction.probability()); + } - if (!SEPARATOR.equals(spanText)) { + return new EntityPrediction(endIndex, probability); - spanText = spanText.replace(CHARS_TO_REPLACE, ""); + } - // This ignores other potential matches in the same sentence - // by only taking the first occurrence. - characterStart = text.indexOf(spanText, characterStart); + private static boolean isBeginLabel(String label) { + return label.startsWith(BEGIN_PREFIX) && label.length() > BEGIN_PREFIX.length(); + } - // TODO: This check should not be needed because the span was found. - // If we aren't finding it now it's because there's a whitespace difference. - if (characterStart != -1) { + private static LabelPrediction predictLabel(float[] scores, Map<Integer, String> id2Labels) { - final int characterEnd = characterStart + spanText.length(); + final int labelIndex = maxIndex(scores); + final String label = id2Labels.get(labelIndex); + if (label == null) { + throw new IllegalArgumentException("No label is configured for model output index " + + labelIndex + "."); + } - spans.add(new Span(characterStart, characterEnd, spanText, confidence)); + return new LabelPrediction(label, labelProbability(scores, labelIndex)); - // OP-1: Only increment characterStart by one. 
- characterStart++; + } - } + static double labelProbability(float[] scores, int labelIndex) { - } + int positiveInfinityCount = 0; + double max = Float.NEGATIVE_INFINITY; - } + for (float score : scores) { + if (score == Float.POSITIVE_INFINITY) { + positiveInfinityCount++; + } else if (!Float.isNaN(score) && score > max) { + max = score; + } + } - } + if (positiveInfinityCount > 0) { + // From decodeSpans, labelIndex is always the argmax, so when any +Inf is present the chosen + // score is +Inf and this returns 1/(number of +Inf). The 0d arm covers a direct caller + // asking for a non-+Inf label's probability while a +Inf label exists (exercised by tests). + return scores[labelIndex] == Float.POSITIVE_INFINITY ? 1d / positiveInfinityCount : 0d; + } - } catch (OrtException ex) { - throw new RuntimeException("Error performing namefinder inference: " + ex.getMessage(), ex); - } + if (max == Float.NEGATIVE_INFINITY) { + return 1d / scores.length; + } + double denominator = 0; + for (float score : scores) { + if (!Float.isNaN(score)) { + denominator += Math.exp(score - max); } - } - return spans.toArray(new Span[0]); - - } + return Math.exp(scores[labelIndex] - max) / denominator; - @Override - public void clearAdaptiveData() { - // No use in this implementation. } - private SpanEnd findSpanEnd(float[][][] v, int startIndex, Map<Integer, String> id2Labels, - String[] tokens) { + static String buildSpanText(String[] tokens, int startIndex, int endIndex) { - // -1 means there is no follow-up token, so it is a single-token span. - int index = -1; - int characterEnd = 0; - - // Starts at the span start in the vector. - // Looks at the next token to see if it is an I-PER. - // Go until the next token is something other than I-PER. - // When the next token is not I-PER, return the previous index. - - for (int x = startIndex + 1; x < v[0].length; x++) { + final StringBuilder span = new StringBuilder(); + String previousToken = null; - // Get the next item. 
- final float[] arr = v[0][x]; - - // See if the next token has an I-PER label. - final String nextTokenClassification = id2Labels.get(maxIndex(arr)); + for (int x = startIndex; x <= endIndex && x < tokens.length; x++) { + final String token = tokens[x]; + if (CLS_TOKEN.equals(token) || SEPARATOR.equals(token)) { + continue; + } - if (!I_PER.equals(nextTokenClassification)) { - index = x - 1; - break; + final boolean subword = token.startsWith(CHARS_TO_REPLACE); + final String surface = subword ? token.substring(CHARS_TO_REPLACE.length()) : token; + if (surface.isEmpty()) { + continue; } + if (span.length() > 0 && !subword && shouldInsertSpace(previousToken, surface)) { + span.append(' '); + } + span.append(surface); + previousToken = surface; } - // Find where the span ends based on the tokens. - for (int x = 1; x <= index && x < tokens.length; x++) { - characterEnd += tokens[x].length(); - } + return span.toString(); + + } - // Account for the number of spaces (that is the number of tokens). - // (One space per token.) 
- characterEnd += index - 1; + private static boolean shouldInsertSpace(String previousToken, String token) { + return previousToken != null && !hasNoSpaceBefore(token) && !hasNoSpaceAfter(previousToken); + } - return new SpanEnd(index, characterEnd); + private static boolean hasNoSpaceBefore(String token) { + return switch (token) { + case ".", ",", ":", ";", "!", "?", ")", "]", "}", "%", "'", "-", "/" -> true; + default -> false; + }; + } + private static boolean hasNoSpaceAfter(String token) { + return switch (token) { + case "(", "[", "{", "$", "'", "-", "/" -> true; + default -> false; + }; } - private int maxIndex(float[] arr) { + static int maxIndex(float[] arr) { double max = Float.NEGATIVE_INFINITY; int index = -1; for (int x = 0; x < arr.length; x++) { - if (arr[x] > max) { + if (!Float.isNaN(arr[x]) && (index == -1 || arr[x] > max)) { index = x; max = arr[x]; } } + if (index == -1) { + throw new IllegalArgumentException( + "Model output scores must contain at least one non-NaN value."); + } + return index; } - private static String findByRegex(String text, String span) { + private static SpanMatch findByRegex(String text, String span, int searchStart) { - final String regex = span - .replaceAll(" ", "\\\\s+") - .replaceAll("\\)", "\\\\)") - .replaceAll("\\(", "\\\\("); + // Reconstructed span text normalizes whitespace, so match flexibly: a space in the span may + // map to any run of whitespace OR none in the source (e.g. punctuation/'&' inside "U.S.A", + // "AT&T" that wordpiece tokenization split apart). Use \s* rather than \s+ so such entities + // are still located instead of being silently dropped. 
+ final String regex = Pattern.quote(span).replace(" ", "\\E\\s*\\Q"); final Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE); final Matcher matcher = pattern.matcher(text); + matcher.region(Math.min(Math.max(searchStart, 0), text.length()), text.length()); if (matcher.find()) { - return matcher.group(0); + return new SpanMatch(matcher.start(), matcher.end()); } - // For some reason the regex match wasn't found. Just return the original span. - return span; + return new SpanMatch(-1, -1); + + } + + private record LabelPrediction(String label, double probability) { + } + + private record EntityPrediction(int endIndex, double probability) { + } + private record SpanMatch(int start, int end) { } private List<Tokens> tokenize(final String text) { diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java index 87fe18c9b..e342634a6 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java @@ -18,18 +18,31 @@ package opennlp.dl.namefinder; import java.util.HashMap; +import java.util.List; import java.util.Map; import org.junit.jupiter.api.Test; import opennlp.tools.tokenize.WordpieceTokenizer; +import opennlp.tools.util.Span; import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertThrows; import static org.junit.jupiter.api.Assertions.assertTrue; public class NameFinderDLTest { + private static final Map<Integer, String> ID_TO_LABELS = Map.of( + 0, "O", + 1, "B-PER", + 2, "I-PER", + 3, "B-LOC", + 4, "I-LOC", + 5, "B-ORG", + 6, "I-ORG", + 7, "B-"); + private static Map<String, Integer> vocab() { final Map<String, Integer> vocab = new 
HashMap<>(); vocab.put(WordpieceTokenizer.BERT_CLS_TOKEN, 0); @@ -57,4 
+70,200 @@ void testTokenIdsRejectsTokensMissingFromVocabulary() { assertTrue(e.getMessage().contains("missing"), "the error message should name the missing token: " + e.getMessage()); } + + @Test + void testDecodeSpansUsesBioEntityTypesAndBoundedProbabilities() { + final String text = "Alice visited New York City."; + final String[] tokens = {"[CLS]", "Alice", "visited", "New", "York", "City", ".", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(1), scoresFor(0), scoresFor(3), scoresFor(4), scoresFor(4), + scoresFor(0), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(2, spans.size()); + + final Span person = spans.get(0); + assertEquals("PER", person.getType()); + assertEquals("Alice", person.getCoveredText(text)); + assertProbability(person); + + final Span location = spans.get(1); + assertEquals("LOC", location.getType()); + assertEquals("New York City", location.getCoveredText(text)); + assertProbability(location); + } + + @Test + void testDecodeSpansReconstructsWordpiecesAndEscapedPunctuation() { + final String text = "Acme (UK) hired Sarah Connor."; + final String[] tokens = {"[CLS]", "Acme", "(", "UK", ")", "hired", "Sarah", "Con", + "##nor", ".", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(5), scoresFor(6), scoresFor(6), scoresFor(6), scoresFor(0), + scoresFor(1), scoresFor(2), scoresFor(2), scoresFor(0), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(2, spans.size()); + assertEquals("ORG", spans.get(0).getType()); + assertEquals("Acme (UK)", spans.get(0).getCoveredText(text)); + assertEquals("PER", spans.get(1).getType()); + assertEquals("Sarah Connor", spans.get(1).getCoveredText(text)); + } + + @Test + void testDecodeSpansIgnoresMalformedBeginLabels() { + final String text = "Alice visited."; + final String[] tokens = {"[CLS]", "Alice", "visited", 
".", "[SEP]"}; + final 
float[][] scores = { + scoresFor(0), scoresFor(7), scoresFor(0), scoresFor(0), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertTrue(spans.isEmpty()); + } + + @Test + void testDecodeSpansRejectsMissingPredictedLabels() { + final String text = "Alice visited."; + final String[] tokens = {"[CLS]", "Alice", "visited", ".", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(1), scoresFor(0), scoresFor(0), scoresFor(0) + }; + final Map<Integer, String> incompleteLabels = Map.of(0, "O"); + + final IllegalArgumentException e = assertThrows(IllegalArgumentException.class, () -> + NameFinderDL.decodeSpans(text, tokens, scores, incompleteLabels)); + + assertTrue(e.getMessage().contains("1"), + "the error message should name the missing label id: " + e.getMessage()); + } + + @Test + void testDecodeSpansSearchStartLocatesNextOccurrence() { + // "Paris" appears twice. Threading the cursor past the first occurrence (as find() does + // across chunks/sentences) locates the second one instead of re-emitting the first, so a + // repeated entity is not duplicated at the same offset. 
+ final String text = "Paris and Paris"; + final String[] tokens = {"[CLS]", "Paris", "[SEP]"}; + final float[][] scores = {scoresFor(0), scoresFor(3), scoresFor(0)}; + + final List<Span> first = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS, 0); + assertEquals(1, first.size()); + assertEquals(0, first.get(0).getStart()); + assertEquals(5, first.get(0).getEnd()); + + final List<Span> next = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS, + first.get(0).getEnd()); + assertEquals(1, next.size()); + assertEquals(10, next.get(0).getStart()); + assertEquals(15, next.get(0).getEnd()); + assertEquals("Paris", next.get(0).getCoveredText(text)); + } + + @Test + void testDecodeSpansLocatesEntityWithInternalPunctuation() { + // WordPiece splits "AT&T" into separate AT / & / T tokens, so the reconstructed span text + // ("AT & T") must still be located in the contiguous source. Regression guard for the + // flexible-whitespace (\s*) matching in findByRegex. + final String text = "Buy AT&T stock"; + final String[] tokens = {"[CLS]", "Buy", "AT", "&", "T", "stock", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(0), scoresFor(5), scoresFor(6), scoresFor(6), + scoresFor(0), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(1, spans.size()); + assertEquals("ORG", spans.get(0).getType()); + assertEquals("AT&T", spans.get(0).getCoveredText(text)); + } + + @Test + void testDecodeSpansMatchesSourceCaseInsensitively() { + // The reconstructed span text may differ in case from the source (e.g. an uncased model); + // findByRegex matches case-insensitively, so the span is still located at the source offsets. 
+ final String text = "Visit PARIS today"; + final String[] tokens = {"[CLS]", "Visit", "paris", "today", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(0), scoresFor(3), scoresFor(0), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(1, spans.size()); + assertEquals("LOC", spans.get(0).getType()); + assertEquals("PARIS", spans.get(0).getCoveredText(text)); + } + + @Test + void testMaxIndexSkipsNaNAndPicksLargestFinite() { + assertEquals(1, NameFinderDL.maxIndex(new float[] {Float.NaN, 5f, -5f})); + } + + @Test + void testMaxIndexRejectsAllNaNOrEmptyScores() { + assertThrows(IllegalArgumentException.class, + () -> NameFinderDL.maxIndex(new float[] {Float.NaN, Float.NaN})); + assertThrows(IllegalArgumentException.class, + () -> NameFinderDL.maxIndex(new float[0])); + } + + @Test + void testLabelProbabilityIsBoundedStableSoftmax() { + // Reference (numpy): softmax([1,2,3])[2] = 0.66524096. + final double p = NameFinderDL.labelProbability(new float[] {1f, 2f, 3f}, 2); + assertEquals(0.66524096, p, 1e-6); + assertBounded(p); + } + + @Test + void testLabelProbabilityHandlesPositiveInfinity() { + // Two +Inf logits split the mass; a finite logit alongside them gets zero. + final float[] scores = {Float.POSITIVE_INFINITY, 0f, Float.POSITIVE_INFINITY}; + assertEquals(0.5, NameFinderDL.labelProbability(scores, 0), 1e-9); + assertEquals(0.0, NameFinderDL.labelProbability(scores, 1), 1e-9); + assertBounded(NameFinderDL.labelProbability(scores, 0)); + } + + @Test + void testLabelProbabilityHandlesAllNegativeInfinity() { + // No finite score: fall back to a uniform distribution rather than producing NaN. 
+ final double p = NameFinderDL.labelProbability( + new float[] {Float.NEGATIVE_INFINITY, Float.NEGATIVE_INFINITY}, 0); + assertEquals(0.5, p, 1e-9); + assertBounded(p); + } + + @Test + void testLabelProbabilityIgnoresNaNInDenominator() { + // A NaN logit must not poison the normalization of the finite ones. + final double p = NameFinderDL.labelProbability(new float[] {0f, Float.NaN, 0f}, 0); + assertEquals(0.5, p, 1e-9); + assertBounded(p); + } + + private static float[] scoresFor(int labelIndex) { + final float[] scores = new float[ID_TO_LABELS.size()]; + for (int i = 0; i < scores.length; i++) { + scores[i] = -5; + } + scores[labelIndex] = 5; + return scores; + } + + private static void assertProbability(Span span) { + assertTrue(span.getProb() > 0 && span.getProb() <= 1, + "span probability should be normalized to (0, 1]: " + span.getProb()); + } + + private static void assertBounded(double probability) { + assertTrue(probability >= 0 && probability <= 1, + "probability must be within [0, 1]: " + probability); + } } diff --git a/opennlp-eval-tests/src/test/java/opennlp/dl/namefinder/NameFinderDLEval.java b/opennlp-eval-tests/src/test/java/opennlp/dl/namefinder/NameFinderDLEval.java index 553c31590..e19742b96 100644 --- a/opennlp-eval-tests/src/test/java/opennlp/dl/namefinder/NameFinderDLEval.java +++ b/opennlp-eval-tests/src/test/java/opennlp/dl/namefinder/NameFinderDLEval.java @@ -69,12 +69,23 @@ public void tokenNameFinder1Test() throws Exception { logger.debug(span.toString()); } - Assertions.assertEquals(1, spans.length); + final String text = String.join(" ", tokens); + + // The model emits a PER and a LOC entity; the person-only decoder previously dropped + // the location. Span types are the entity labels (PER/LOC), not the matched text. 
+ Assertions.assertEquals(2, spans.length); + + Assertions.assertEquals("PER", spans[0].getType()); Assertions.assertEquals(0, spans[0].getStart()); Assertions.assertEquals(17, spans[0].getEnd()); - Assertions.assertEquals(8.251646041870117, spans[0].getProb(), 0.00001); - Assertions.assertEquals("George Washington", - spans[0].getCoveredText(String.join(" ", tokens))); + Assertions.assertEquals("George Washington", spans[0].getCoveredText(text)); + Assertions.assertTrue(spans[0].getProb() > 0 && spans[0].getProb() <= 1); + + Assertions.assertEquals("LOC", spans[1].getType()); + Assertions.assertEquals(39, spans[1].getStart()); + Assertions.assertEquals(52, spans[1].getEnd()); + Assertions.assertEquals("United States", spans[1].getCoveredText(text)); + Assertions.assertTrue(spans[1].getProb() > 0 && spans[1].getProb() <= 1); } } @@ -113,10 +124,16 @@ public void tokenNameFinderConcurrentTest() throws Exception { startGate.await(); for (int i = 0; i < iterationsPerThread; i++) { final Span[] spans = nameFinderDL.find(tokens); - if (spans.length != 1 + // The all-entity decoder yields both the PER and the LOC span for this input. + if (spans.length != 2 || spans[0].getStart() != 0 || spans[0].getEnd() != 17 - || !"George Washington".equals(spans[0].getCoveredText(text))) { + || !"PER".equals(spans[0].getType()) + || !"George Washington".equals(spans[0].getCoveredText(text)) + || spans[1].getStart() != 39 + || spans[1].getEnd() != 52 + || !"LOC".equals(spans[1].getType()) + || !"United States".equals(spans[1].getCoveredText(text))) { return false; } } @@ -151,6 +168,7 @@ public void nameFinderDlConcurrentWithSentenceDetectorMe() throws Exception { final String[] tokens = new String[] {"George", "Washington", "was", "president", "of", "the", "United", "States", "."}; + final String text = String.join(" ", tokens); // Explicitly construct the detector inside the test to make the precondition visible. 
final SentenceDetectorME detector = new SentenceDetectorME("en"); @@ -171,9 +189,16 @@ public void nameFinderDlConcurrentWithSentenceDetectorMe() throws Exception { startGate.await(); for (int i = 0; i < iterationsPerThread; i++) { final Span[] spans = nameFinderDL.find(tokens); - if (spans.length != 1 + // The all-entity decoder yields both the PER and the LOC span for this input. + if (spans.length != 2 || spans[0].getStart() != 0 - || spans[0].getEnd() != 17) { + || spans[0].getEnd() != 17 + || !"PER".equals(spans[0].getType()) + || !"George Washington".equals(spans[0].getCoveredText(text)) + || spans[1].getStart() != 39 + || spans[1].getEnd() != 52 + || !"LOC".equals(spans[1].getType()) + || !"United States".equals(spans[1].getCoveredText(text))) { return false; } } @@ -213,8 +238,9 @@ public void tokenNameFinderSnapshotsInferenceOptionsTest() throws Exception { options, sentenceDetector)) { final Span[] baseline = nameFinderDL.find(tokens); - Assertions.assertEquals(1, baseline.length); + Assertions.assertEquals(2, baseline.length); Assertions.assertEquals("George Washington", baseline[0].getCoveredText(text)); + Assertions.assertEquals("United States", baseline[1].getCoveredText(text)); // Mutate the options in ways that would change inference if they were read live: // a split size of 1 would chunk the input one word at a time. 
@@ -224,11 +250,14 @@ public void tokenNameFinderSnapshotsInferenceOptionsTest() throws Exception { options.setSplitOverlapSize(0); final Span[] afterMutation = nameFinderDL.find(tokens); - Assertions.assertEquals(1, afterMutation.length, + Assertions.assertEquals(2, afterMutation.length, "mutating InferenceOptions after construction must not affect a built instance"); Assertions.assertEquals(0, afterMutation[0].getStart()); Assertions.assertEquals(17, afterMutation[0].getEnd()); Assertions.assertEquals("George Washington", afterMutation[0].getCoveredText(text)); + Assertions.assertEquals(39, afterMutation[1].getStart()); + Assertions.assertEquals(52, afterMutation[1].getEnd()); + Assertions.assertEquals("United States", afterMutation[1].getCoveredText(text)); } } @@ -253,8 +282,11 @@ public void tokenNameFinder2Test() throws Exception { } Assertions.assertEquals(1, spans.length); + Assertions.assertEquals("PER", spans[0].getType()); Assertions.assertEquals(13, spans[0].getStart()); Assertions.assertEquals(30, spans[0].getEnd()); + Assertions.assertEquals("George Washington", + spans[0].getCoveredText(String.join(" ", tokens))); } } @@ -278,8 +310,10 @@ public void tokenNameFinder3Test() throws Exception { } Assertions.assertEquals(1, spans.length); + Assertions.assertEquals("PER", spans[0].getType()); Assertions.assertEquals(13, spans[0].getStart()); Assertions.assertEquals(19, spans[0].getEnd()); + Assertions.assertEquals("George", spans[0].getCoveredText(String.join(" ", tokens))); } } @@ -342,11 +376,17 @@ public void tokenNameFinderMultipleEntitiesTest() throws Exception { logger.debug(span.toString()); } + final String text = String.join(" ", tokens); + Assertions.assertEquals(2, spans.length); + Assertions.assertEquals("PER", spans[0].getType()); Assertions.assertEquals(0, spans[0].getStart()); Assertions.assertEquals(17, spans[0].getEnd()); + Assertions.assertEquals("George Washington", spans[0].getCoveredText(text)); + Assertions.assertEquals("PER", 
spans[1].getType()); Assertions.assertEquals(22, spans[1].getStart()); Assertions.assertEquals(37, spans[1].getEnd()); + Assertions.assertEquals("Abraham Lincoln", spans[1].getCoveredText(text)); } From a3c423a2fd499ab8e711b3735aaaba38baced6dd Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Thu, 18 Jun 2026 04:23:45 -0400 Subject: [PATCH 02/11] OPENNLP-1846 - Address NameFinderDL review feedback Keep unmapped label ids graceful, bound decoded span lookup to the current sentence, add diagnostics for unlocated decoded spans, and tighten exception types/messages plus helper documentation. --- .../opennlp/dl/namefinder/NameFinderDL.java | 186 +++++++++++++++--- .../dl/namefinder/NameFinderDLTest.java | 54 +++-- 2 files changed, 199 insertions(+), 41 deletions(-) diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java index ea5656ce5..483ea6711 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java @@ -33,6 +33,8 @@ import ai.onnxruntime.OnnxTensor; import ai.onnxruntime.OrtException; import ai.onnxruntime.OrtSession; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; import opennlp.dl.AbstractDL; import opennlp.dl.InferenceOptions; @@ -71,18 +73,27 @@ public class NameFinderDL extends AbstractDL implements TokenNameFinder { public static final String I_PER = "I-PER"; public static final String B_PER = "B-PER"; public static final String SEPARATOR = "[SEP]"; - private static final String CLS_TOKEN = "[CLS]"; + public static final String CLS_TOKEN = "[CLS]"; /** Prefix used by BIO labels for the first token in an entity span. 
*/ - private static final String BEGIN_PREFIX = "B-"; + public static final String PREFIX_BEGIN = "B-"; /** Prefix used by BIO labels for continuation tokens in an entity span. */ - private static final String INSIDE_PREFIX = "I-"; + public static final String PREFIX_INSIDE = "I-"; + + /** Tokens that attach directly to the preceding token when span text is reconstructed. */ + public static final String[] NO_SPACE_BEFORE_TOKENS = + {".", ",", ":", ";", "!", "?", ")", "]", "}", "%", "'", "-", "/"}; + + /** Tokens after which the following token attaches directly when span text is reconstructed. */ + public static final String[] NO_SPACE_AFTER_TOKENS = + {"(", "[", "{", "$", "'", "-", "/"}; /** NER models are commonly cased, so lower casing is off by default. */ private static final boolean LOWER_CASE_DEFAULT = false; private static final String CHARS_TO_REPLACE = "##"; + private static final Logger logger = LoggerFactory.getLogger(NameFinderDL.class); private final SentenceDetector sentenceDetector; private final Map ids2Labels; @@ -149,6 +160,17 @@ private static InferenceOptions validateConstructorArguments( return inferenceOptions; } + /** + * {@inheritDoc} + * + *

<p>This method joins the provided tokens with spaces, sentence-splits the joined text, + * runs each sentence through the ONNX token-classification model, decodes BIO labels into + * {@link Span spans}, and resolves those spans back to character offsets in the joined text.</p>

+ * + * @throws IllegalStateException Thrown if inference fails, if the model output shape is not + * the expected {@code float[batch][token][label]} form, or if the model output contains + * no usable label score for a token. + */ @Override public Span[] find(String[] input) { @@ -174,7 +196,8 @@ public Span[] find(String[] input) { for (final Tokens tokens : wordpieceTokens) { final List decoded = - decodeSpans(text, tokens.tokens(), infer(tokens), ids2Labels, searchStart); + decodeSpans(text, tokens.tokens(), infer(tokens), ids2Labels, searchStart, + sentenceSpan.getEnd()); spans.addAll(decoded); if (!decoded.isEmpty()) { searchStart = decoded.get(decoded.size() - 1).getEnd(); @@ -218,8 +241,12 @@ private float[][] infer(final Tokens tokens) { try (OrtSession.Result result = session.run(inputs)) { output = result.get(0).getValue(); } - } catch (OrtException | RuntimeException ex) { - throw new IllegalStateException("Unable to perform name finder inference", ex); + } catch (OrtException ex) { + throw new IllegalStateException( + "Unable to perform name finder inference: " + ex.getMessage(), ex); + } catch (RuntimeException ex) { + throw new IllegalStateException( + "Unexpected runtime failure during name finder inference: " + ex.getMessage(), ex); } finally { inputs.values().forEach(OnnxTensor::close); } @@ -227,7 +254,10 @@ private float[][] infer(final Tokens tokens) { // The model returns one score row per token, batched: float[batch][token][label]. Any other // shape (or an empty batch) is a model-contract violation, surfaced on its own rather than as // "inference failed". 
- if (output instanceof float[][][] v && v.length > 0) { + if (output instanceof float[][][] v) { + if (v.length == 0) { + throw new IllegalStateException("Model output batch must contain at least one entry."); + } return v[0]; } throw new IllegalStateException("Unexpected model output type: " @@ -240,14 +270,14 @@ public void clearAdaptiveData() { } /** - * Decodes spans beginning the character search at the start of {@code text}. Equivalent to + * Decodes {@link Span spans} beginning the character search at the start of {@code text}. Equivalent to * {@link #decodeSpans(String, String[], float[][], Map, int)} with {@code searchStart == 0}. * * @param text The original text passed to the model. * @param tokens The WordPiece tokens produced for the text. * @param tokenLabelScores The per-token label scores returned by the model. * @param id2Labels The mapping from model output indexes to BIO labels. - * @return The decoded spans. + * @return The decoded {@link Span spans}. */ static List decodeSpans(String text, String[] tokens, float[][] tokenLabelScores, Map id2Labels) { @@ -255,7 +285,7 @@ static List decodeSpans(String text, String[] tokens, float[][] tokenLabel } /** - * Converts model token classifications into character spans in the original input text. + * Converts model token classifications into character {@link Span spans} in the original input text. * *

The ONNX model returns one score vector for each WordPiece token. This method applies * BIO decoding, reconstructs WordPiece fragments, and then resolves the reconstructed text @@ -269,10 +299,29 @@ static List decodeSpans(String text, String[] tokens, float[][] tokenLabel * @param searchStart The character offset in {@code text} to begin locating spans from. Threading * a monotonic cursor across the chunks and sentences of a single {@link #find(String[])} call * keeps a repeated entity surface form from being emitted twice at the same first occurrence. - * @return The decoded spans. + * @return The decoded {@link Span spans}. */ static List decodeSpans(String text, String[] tokens, float[][] tokenLabelScores, Map id2Labels, int searchStart) { + return decodeSpans(text, tokens, tokenLabelScores, id2Labels, searchStart, text.length()); + } + + /** + * Converts model token classifications into character {@link Span spans} within a bounded + * region of the original input text. + * + * @param text The original text passed to the model. + * @param tokens The WordPiece tokens produced for the text. + * @param tokenLabelScores The per-token label scores returned by the model. + * @param id2Labels The mapping from model output indexes to BIO labels. + * @param searchStart The first character offset in {@code text} to search. + * @param searchEnd The exclusive upper bound for locating reconstructed spans. During + * {@link #find(String[])}, this is the current sentence end so an entity from one sentence + * cannot be resolved to an identical surface form in a later sentence. + * @return The decoded {@link Span spans}. 
+ */ + static List decodeSpans(String text, String[] tokens, float[][] tokenLabelScores, + Map id2Labels, int searchStart, int searchEnd) { if (tokens.length != tokenLabelScores.length) { throw new IllegalArgumentException("The number of tokens (" + tokens.length @@ -289,7 +338,7 @@ static List decodeSpans(String text, String[] tokens, float[][] tokenLabel continue; } - final String entityType = prediction.label().substring(BEGIN_PREFIX.length()); + final String entityType = prediction.label().substring(PREFIX_BEGIN.length()); final EntityPrediction entity = findEntityEnd(tokenLabelScores, x, id2Labels, entityType, prediction.probability()); final String spanText = buildSpanText(tokens, x, entity.endIndex()); @@ -299,10 +348,13 @@ static List decodeSpans(String text, String[] tokens, float[][] tokenLabel continue; } - final SpanMatch match = findByRegex(text, spanText, characterStart); + final SpanMatch match = findByRegex(text, spanText, characterStart, searchEnd); if (match.start() != -1) { spans.add(new Span(match.start(), match.end(), entityType, entity.probability())); characterStart = match.end(); + } else { + logger.debug("Unable to locate decoded {} span '{}' in source text region [{}, {}).", + entityType, spanText, characterStart, searchEnd); } x = entity.endIndex(); @@ -312,12 +364,26 @@ static List decodeSpans(String text, String[] tokens, float[][] tokenLabel } + /** + * Finds the final token index and confidence for one BIO entity that starts at {@code startIndex}. + * + *

<p>The span continues while subsequent predictions are the {@code I-} label for the same + * entity type. The returned probability is the minimum token probability across the entity, + * so a multi-token span reflects its weakest token.</p>

+ * + * @param tokenLabelScores The per-token label scores returned by the model. + * @param startIndex The token index where the entity begins. + * @param id2Labels The mapping from model output indexes to BIO labels. + * @param entityType The entity type without its BIO prefix, for example {@code PER}. + * @param startProbability The normalized probability of the begin label. + * @return The last token index and probability for the entity. + */ private static EntityPrediction findEntityEnd(float[][] tokenLabelScores, int startIndex, Map id2Labels, String entityType, double startProbability) { - final String insideLabel = INSIDE_PREFIX + entityType; + final String insideLabel = PREFIX_INSIDE + entityType; int endIndex = startIndex; double probability = startProbability; @@ -334,23 +400,47 @@ private static EntityPrediction findEntityEnd(float[][] tokenLabelScores, int st } + /** + * Returns whether a label is a well-formed BIO begin label. + * + * @param label The label to inspect. + * @return {@code true} for {@code B-} labels with a non-empty type. + */ private static boolean isBeginLabel(String label) { - return label.startsWith(BEGIN_PREFIX) && label.length() > BEGIN_PREFIX.length(); + return label.startsWith(PREFIX_BEGIN) && label.length() > PREFIX_BEGIN.length(); } + /** + * Picks the predicted BIO label for one token. + * + *

<p>If the model's argmax index is absent from {@code id2Labels}, the token is treated as + * outside ({@code O}). This preserves the previous graceful behavior for partial label maps: + * one unmapped output row does not discard the whole {@link #find(String[])} result.</p>

+ * + * @param scores The model scores for one token. + * @param id2Labels The mapping from model output indexes to BIO labels. + * @return The predicted label and its normalized probability. + */ private static LabelPrediction predictLabel(float[] scores, Map id2Labels) { final int labelIndex = maxIndex(scores); final String label = id2Labels.get(labelIndex); if (label == null) { - throw new IllegalArgumentException("No label is configured for model output index " - + labelIndex + "."); + return new LabelPrediction("O", 0d); } return new LabelPrediction(label, labelProbability(scores, labelIndex)); } + /** + * Normalizes model scores into a probability for one label index using a numerically stable + * softmax. + * + * @param scores The raw model scores for one token. + * @param labelIndex The label index whose probability should be returned. + * @return The normalized probability in {@code [0, 1]}. + */ static double labelProbability(float[] scores, int labelIndex) { int positiveInfinityCount = 0; @@ -386,6 +476,18 @@ static double labelProbability(float[] scores, int labelIndex) { } + /** + * Reconstructs source-like text from a span of WordPiece tokens. + * + *

<p>Special BERT tokens are skipped, {@code ##} continuations are merged into the preceding + * surface form, and simple punctuation spacing is normalized so the result can be located in + * the caller's original text.</p>

+ * + * @param tokens The WordPiece token sequence. + * @param startIndex The first token index to include. + * @param endIndex The last token index to include. + * @return The reconstructed span text. + */ static String buildSpanText(String[] tokens, int startIndex, int endIndex) { final StringBuilder span = new StringBuilder(); @@ -419,20 +521,30 @@ private static boolean shouldInsertSpace(String previousToken, String token) { } private static boolean hasNoSpaceBefore(String token) { - return switch (token) { - case ".", ",", ":", ";", "!", "?", ")", "]", "}", "%", "'", "-", "/" -> true; - default -> false; - }; + return containsToken(NO_SPACE_BEFORE_TOKENS, token); } private static boolean hasNoSpaceAfter(String token) { - return switch (token) { - case "(", "[", "{", "$", "'", "-", "/" -> true; - default -> false; - }; + return containsToken(NO_SPACE_AFTER_TOKENS, token); } - static int maxIndex(float[] arr) { + private static boolean containsToken(String[] tokens, String token) { + for (String candidate : tokens) { + if (candidate.equals(token)) { + return true; + } + } + return false; + } + + /** + * Returns the index of the largest non-NaN score. + * + * @param arr The score array to inspect. + * @return The index of the maximum non-NaN value. + * @throws IllegalStateException Thrown if the model output contains no non-NaN score. + */ + private static int maxIndex(float[] arr) { double max = Float.NEGATIVE_INFINITY; int index = -1; @@ -445,7 +557,7 @@ static int maxIndex(float[] arr) { } if (index == -1) { - throw new IllegalArgumentException( + throw new IllegalStateException( "Model output scores must contain at least one non-NaN value."); } @@ -453,7 +565,17 @@ static int maxIndex(float[] arr) { } - private static SpanMatch findByRegex(String text, String span, int searchStart) { + /** + * Locates reconstructed span text in a bounded region of the original input text. + * + * @param text The original text. + * @param span The reconstructed span text. 
+ * @param searchStart The first character offset to search from. + * @param searchEnd The exclusive upper bound of the region to search. + * @return The matched character offsets, or {@code (-1, -1)} when the reconstructed text + * cannot be found in the requested region. + */ + private static SpanMatch findByRegex(String text, String span, int searchStart, int searchEnd) { // Reconstructed span text normalizes whitespace, so match flexibly: a space in the span may // map to any run of whitespace OR none in the source (e.g. punctuation/'&' inside "U.S.A", @@ -463,7 +585,9 @@ private static SpanMatch findByRegex(String text, String span, int searchStart) final Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE); final Matcher matcher = pattern.matcher(text); - matcher.region(Math.min(Math.max(searchStart, 0), text.length()), text.length()); + final int regionStart = Math.min(Math.max(searchStart, 0), text.length()); + final int regionEnd = Math.min(Math.max(searchEnd, regionStart), text.length()); + matcher.region(regionStart, regionEnd); if (matcher.find()) { return new SpanMatch(matcher.start(), matcher.end()); @@ -479,6 +603,10 @@ private record LabelPrediction(String label, double probability) { private record EntityPrediction(int endIndex, double probability) { } + /** + * Character offsets for a matched span. {@code (-1, -1)} means the reconstructed entity text + * could not be located in the searched source-text region. 
+ */ private record SpanMatch(int start, int end) { } diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java index e342634a6..191144874 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java @@ -128,7 +128,7 @@ void testDecodeSpansIgnoresMalformedBeginLabels() { } @Test - void testDecodeSpansRejectsMissingPredictedLabels() { + void testDecodeSpansTreatsMissingPredictedLabelsAsOutside() { final String text = "Alice visited."; final String[] tokens = {"[CLS]", "Alice", "visited", ".", "[SEP]"}; final float[][] scores = { @@ -136,11 +136,9 @@ void testDecodeSpansRejectsMissingPredictedLabels() { }; final Map incompleteLabels = Map.of(0, "O"); - final IllegalArgumentException e = assertThrows(IllegalArgumentException.class, () -> - NameFinderDL.decodeSpans(text, tokens, scores, incompleteLabels)); + final List spans = NameFinderDL.decodeSpans(text, tokens, scores, incompleteLabels); - assertTrue(e.getMessage().contains("1"), - "the error message should name the missing label id: " + e.getMessage()); + assertTrue(spans.isEmpty()); } @Test @@ -184,6 +182,18 @@ void testDecodeSpansLocatesEntityWithInternalPunctuation() { assertEquals("AT&T", spans.get(0).getCoveredText(text)); } + @Test + void testDecodeSpansDoesNotMatchBeyondSearchEnd() { + final String text = "London was quiet. 
Later Paris was loud."; + final String[] tokens = {"[CLS]", "Paris", "[SEP]"}; + final float[][] scores = {scoresFor(0), scoresFor(3), scoresFor(0)}; + + final List spans = NameFinderDL.decodeSpans( + text, tokens, scores, ID_TO_LABELS, 0, text.indexOf(" Later")); + + assertTrue(spans.isEmpty()); + } + @Test void testDecodeSpansMatchesSourceCaseInsensitively() { // The reconstructed span text may differ in case from the source (e.g. an uncased model); @@ -202,16 +212,30 @@ void testDecodeSpansMatchesSourceCaseInsensitively() { } @Test - void testMaxIndexSkipsNaNAndPicksLargestFinite() { - assertEquals(1, NameFinderDL.maxIndex(new float[] {Float.NaN, 5f, -5f})); + void testDecodeSpansSkipsNaNAndPicksLargestFinite() { + final String text = "Alice visited."; + final String[] tokens = {"[CLS]", "Alice", "visited", ".", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresWithNaN(1), scoresFor(0), scoresFor(0), scoresFor(0) + }; + + final List spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(1, spans.size()); + assertEquals("Alice", spans.get(0).getCoveredText(text)); } @Test - void testMaxIndexRejectsAllNaNOrEmptyScores() { - assertThrows(IllegalArgumentException.class, - () -> NameFinderDL.maxIndex(new float[] {Float.NaN, Float.NaN})); - assertThrows(IllegalArgumentException.class, - () -> NameFinderDL.maxIndex(new float[0])); + void testDecodeSpansRejectsAllNaNOrEmptyScores() { + final String text = "Alice visited."; + final String[] tokens = {"[CLS]", "Alice", "visited", ".", "[SEP]"}; + + assertThrows(IllegalStateException.class, () -> NameFinderDL.decodeSpans(text, tokens, + new float[][] {scoresFor(0), new float[] {Float.NaN, Float.NaN}, scoresFor(0), + scoresFor(0), scoresFor(0)}, ID_TO_LABELS)); + assertThrows(IllegalStateException.class, () -> NameFinderDL.decodeSpans(text, tokens, + new float[][] {scoresFor(0), new float[0], scoresFor(0), scoresFor(0), scoresFor(0)}, + ID_TO_LABELS)); } @Test @@ -257,6 +281,12 @@ 
private static float[] scoresFor(int labelIndex) { return scores; } + private static float[] scoresWithNaN(int labelIndex) { + final float[] scores = scoresFor(labelIndex); + scores[0] = Float.NaN; + return scores; + } + private static void assertProbability(Span span) { assertTrue(span.getProb() > 0 && span.getProb() <= 1, "span probability should be normalized to (0, 1]: " + span.getProb()); From 0ada8a83594d3a5d98d1d7721523937acaa4e998 Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Thu, 18 Jun 2026 04:34:37 -0400 Subject: [PATCH 03/11] OPENNLP-1846 - Harden NameFinderDL constants and fail loud on unmapped labels Make the public no-space token constants immutable (Set.of instead of mutable arrays) while keeping them public for third-party use. Fail loud on an unmapped model output index: predictLabel now throws an IllegalStateException naming the index instead of degrading the token to "O", and the constructors document that ids2Labels must be exhaustive over the model's output indices. Also document the IllegalArgumentException that find() can raise on a vocabulary/model mismatch. Add edge-case decoding tests: token/score count mismatch, orphan I- labels, adjacent entities of different types, multi-token minimum-probability semantics, repeated entities at distinct offsets within one call, regex metacharacters in span text, and search-start clamping past end of text. 
--- .../opennlp/dl/namefinder/NameFinderDL.java | 47 +++--- .../dl/namefinder/NameFinderDLTest.java | 135 +++++++++++++++++- 2 files changed, 155 insertions(+), 27 deletions(-) diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java index 483ea6711..e5b5c89b5 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java @@ -27,6 +27,7 @@ import java.util.List; import java.util.Map; import java.util.Objects; +import java.util.Set; import java.util.regex.Matcher; import java.util.regex.Pattern; @@ -82,12 +83,12 @@ public class NameFinderDL extends AbstractDL implements TokenNameFinder { public static final String PREFIX_INSIDE = "I-"; /** Tokens that attach directly to the preceding token when span text is reconstructed. */ - public static final String[] NO_SPACE_BEFORE_TOKENS = - {".", ",", ":", ";", "!", "?", ")", "]", "}", "%", "'", "-", "/"}; + public static final Set NO_SPACE_BEFORE_TOKENS = + Set.of(".", ",", ":", ";", "!", "?", ")", "]", "}", "%", "'", "-", "/"); /** Tokens after which the following token attaches directly when span text is reconstructed. */ - public static final String[] NO_SPACE_AFTER_TOKENS = - {"(", "[", "{", "$", "'", "-", "/"}; + public static final Set NO_SPACE_AFTER_TOKENS = + Set.of("(", "[", "{", "$", "'", "-", "/"); /** NER models are commonly cased, so lower casing is off by default. */ private static final boolean LOWER_CASE_DEFAULT = false; @@ -109,7 +110,9 @@ public class NameFinderDL extends AbstractDL implements TokenNameFinder { * * @param model The ONNX model file. * @param vocabulary The model file's vocabulary file. - * @param ids2Labels The mapping of ids to labels. + * @param ids2Labels The mapping of model output indices to BIO labels. 
This must be exhaustive + * over the model's output indices; a token whose predicted index is unmapped raises an + * {@link IllegalStateException} during {@link #find(String[])}. * @param sentenceDetector The {@link SentenceDetector} to be used. * * @throws OrtException Thrown if the {@code model} cannot be loaded. @@ -127,7 +130,9 @@ public NameFinderDL(File model, File vocabulary, Map ids2Labels * * @param model The ONNX model file. * @param vocabulary The model file's vocabulary file. - * @param ids2Labels The mapping of ids to labels. + * @param ids2Labels The mapping of model output indices to BIO labels. This must be exhaustive + * over the model's output indices; a token whose predicted index is unmapped raises an + * {@link IllegalStateException} during {@link #find(String[])}. * @param inferenceOptions {@link InferenceOptions} to control the inference. * @param sentenceDetector The {@link SentenceDetector} to be used. * @@ -168,8 +173,11 @@ private static InferenceOptions validateConstructorArguments( * {@link Span spans}, and resolves those spans back to character offsets in the joined text.

* * @throws IllegalStateException Thrown if inference fails, if the model output shape is not - * the expected {@code float[batch][token][label]} form, or if the model output contains - * no usable label score for a token. + * the expected {@code float[batch][token][label]} form, if the model output contains + * no usable label score for a token, or if the model's predicted index for a token is not + * present in the configured label map. + * @throws IllegalArgumentException Thrown if a token produced for the input is not present in + * the vocabulary, which indicates the vocabulary file does not match the model. */ @Override public Span[] find(String[] input) { @@ -413,20 +421,20 @@ private static boolean isBeginLabel(String label) { /** * Picks the predicted BIO label for one token. * - *

<p>If the model's argmax index is absent from {@code id2Labels}, the token is treated as - * outside ({@code O}). This preserves the previous graceful behavior for partial label maps: - * one unmapped output row does not discard the whole {@link #find(String[])} result.</p>

- * * @param scores The model scores for one token. * @param id2Labels The mapping from model output indexes to BIO labels. * @return The predicted label and its normalized probability. + * @throws IllegalStateException Thrown if the model's argmax index is absent from + * {@code id2Labels}, which means the label map is not exhaustive over the model's output + * indices and the model/label-map pair is misconfigured. */ private static LabelPrediction predictLabel(float[] scores, Map<Integer, String> id2Labels) { final int labelIndex = maxIndex(scores); final String label = id2Labels.get(labelIndex); if (label == null) { - return new LabelPrediction("O", 0d); + throw new IllegalStateException("Model output index " + labelIndex + + " has no configured label; ids2Labels must map every model output index."); } return new LabelPrediction(label, labelProbability(scores, labelIndex)); @@ -521,20 +529,11 @@ private static boolean shouldInsertSpace(String previousToken, String token) { } private static boolean hasNoSpaceBefore(String token) { - return containsToken(NO_SPACE_BEFORE_TOKENS, token); + return NO_SPACE_BEFORE_TOKENS.contains(token); } private static boolean hasNoSpaceAfter(String token) { - return containsToken(NO_SPACE_AFTER_TOKENS, token); - } - - private static boolean containsToken(String[] tokens, String token) { - for (String candidate : tokens) { - if (candidate.equals(token)) { - return true; - } - } - return false; + return NO_SPACE_AFTER_TOKENS.contains(token); } /** diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java index 191144874..c0a8aede2 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java @@ -128,7 +128,7 @@ void testDecodeSpansIgnoresMalformedBeginLabels() { } @Test
- void testDecodeSpansTreatsMissingPredictedLabelsAsOutside() { + void testDecodeSpansRejectsMissingPredictedLabels() { final String text = "Alice visited."; final String[] tokens = {"[CLS]", "Alice", "visited", ".", "[SEP]"}; final float[][] scores = { @@ -136,9 +136,11 @@ void testDecodeSpansTreatsMissingPredictedLabelsAsOutside() { }; final Map<Integer, String> incompleteLabels = Map.of(0, "O"); - final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, incompleteLabels); + final IllegalStateException e = assertThrows(IllegalStateException.class, () -> + NameFinderDL.decodeSpans(text, tokens, scores, incompleteLabels)); - assertTrue(spans.isEmpty()); + assertTrue(e.getMessage().contains("1"), + "the error message should name the missing label id: " + e.getMessage()); } @Test @@ -238,6 +240,122 @@ void testDecodeSpansRejectsAllNaNOrEmptyScores() { ID_TO_LABELS)); } + @Test + void testDecodeSpansRejectsTokenScoreCountMismatch() { + // Fewer score rows than tokens is a model/tokenizer contract violation; the message must name + // both counts so the mismatch is debuggable. + final String text = "Alice visited."; + final String[] tokens = {"[CLS]", "Alice", "visited", ".", "[SEP]"}; + final float[][] scores = {scoresFor(0), scoresFor(1)}; + + final IllegalArgumentException e = assertThrows(IllegalArgumentException.class, () -> + NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS)); + + assertTrue(e.getMessage().contains("5") && e.getMessage().contains("2"), + "the error message should name both counts: " + e.getMessage()); + } + + @Test + void testDecodeSpansIgnoresInsideLabelWithoutBegin() { + // An I-LOC with no preceding B-LOC is not a valid span start and must not emit an entity.
+ final String text = "Visit Paris today"; + final String[] tokens = {"[CLS]", "Visit", "Paris", "today", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(0), scoresFor(4), scoresFor(0), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertTrue(spans.isEmpty()); + } + + @Test + void testDecodeSpansSeparatesAdjacentEntitiesOfDifferentTypes() { + // B-PER directly followed by B-LOC must yield two distinct single-token spans, not one merged + // span: findEntityEnd stops at the type change and the outer loop resumes at the next begin. + final String text = "Alice Paris"; + final String[] tokens = {"[CLS]", "Alice", "Paris", "[SEP]"}; + final float[][] scores = {scoresFor(0), scoresFor(1), scoresFor(3), scoresFor(0)}; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(2, spans.size()); + assertEquals("PER", spans.get(0).getType()); + assertEquals("Alice", spans.get(0).getCoveredText(text)); + assertEquals("LOC", spans.get(1).getType()); + assertEquals("Paris", spans.get(1).getCoveredText(text)); + } + + @Test + void testMultiTokenSpanProbabilityIsWeakestTokenProbability() { + // The probability of a multi-token entity is the minimum across its tokens, so a confident + // begin followed by a weak continuation reports the weak continuation's probability.
+ final String text = "New York"; + final String[] tokens = {"[CLS]", "New", "York", "[SEP]"}; + final float[] strongBegin = scoresFor(3); + final float[] weakInside = weakScoresFor(4); + final float[][] scores = {scoresFor(0), strongBegin, weakInside, scoresFor(0)}; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(1, spans.size()); + assertEquals("New York", spans.get(0).getCoveredText(text)); + assertEquals(NameFinderDL.labelProbability(weakInside, 4), spans.get(0).getProb(), 1e-9); + assertTrue(spans.get(0).getProb() < NameFinderDL.labelProbability(strongBegin, 3), + "multi-token span should reflect its weakest continuation"); + } + + @Test + void testDecodeSpansEmitsRepeatedEntityAtDistinctOffsets() { + // Two identical surface forms within a single call must resolve to distinct, non-overlapping + // spans via the internal monotonic cursor rather than both matching the first occurrence. + final String text = "Paris and Paris"; + final String[] tokens = {"[CLS]", "Paris", "and", "Paris", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(3), scoresFor(0), scoresFor(3), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(2, spans.size()); + assertEquals(0, spans.get(0).getStart()); + assertEquals(5, spans.get(0).getEnd()); + assertEquals(10, spans.get(1).getStart()); + assertEquals(15, spans.get(1).getEnd()); + } + + @Test + void testDecodeSpansLocatesEntityWithRegexMetacharacters() { + // WordPiece splits "C++" into C / + / + tokens, so the reconstructed span text contains regex + // metacharacters. Pattern.quote must treat them literally (not as quantifiers) for the entity + // to be located in the source.
+ final String text = "Love C++ today"; + final String[] tokens = {"[CLS]", "Love", "C", "+", "+", "today", "[SEP]"}; + final float[][] scores = { + scoresFor(0), scoresFor(0), scoresFor(5), scoresFor(6), scoresFor(6), + scoresFor(0), scoresFor(0) + }; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(1, spans.size()); + assertEquals("ORG", spans.get(0).getType()); + assertEquals("C++", spans.get(0).getCoveredText(text)); + } + + @Test + void testDecodeSpansClampsSearchStartBeyondText() { + // A searchStart past the end of the text must clamp to an empty region and yield no match + // rather than throwing an out-of-bounds error. + final String text = "Paris"; + final String[] tokens = {"[CLS]", "Paris", "[SEP]"}; + final float[][] scores = {scoresFor(0), scoresFor(3), scoresFor(0)}; + + final List<Span> spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS, 999); + + assertTrue(spans.isEmpty()); + } + @Test void testLabelProbabilityIsBoundedStableSoftmax() { // Reference (numpy): softmax([1,2,3])[2] = 0.66524096. @@ -287,6 +405,17 @@ private static float[] scoresWithNaN(int labelIndex) { return scores; } + // Lower-margin scores than scoresFor, so the chosen label's softmax probability is well below 1 + // and a multi-token span's minimum-probability behavior is observable.
+ private static float[] weakScoresFor(int labelIndex) { + final float[] scores = new float[ID_TO_LABELS.size()]; + for (int i = 0; i < scores.length; i++) { + scores[i] = -1; + } + scores[labelIndex] = 1; + return scores; + } + private static void assertProbability(Span span) { assertTrue(span.getProb() > 0 && span.getProb() <= 1, "span probability should be normalized to (0, 1]: " + span.getProb()); From 0d53e31bb3f6774ad4eb2ad3c8171b64ef5bfdce Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Thu, 18 Jun 2026 22:28:13 -0400 Subject: [PATCH 04/11] OPENNLP-1850 - Add robust character sequence normalization utilities and tests Co-authored-by: Junie Signed-off-by: Kristian Rickert --- opennlp-api/pom.xml | 6 + .../tools/util/normalizer/CharClass.java | 383 ++++++++++++++++++ .../tools/util/normalizer/CodePointSet.java | 245 +++++++++++ .../tools/util/normalizer/NormalizedText.java | 51 +++ .../tools/util/normalizer/OffsetMap.java | 135 ++++++ .../tools/util/normalizer/UnicodeDash.java | 189 +++++++++ .../util/normalizer/UnicodeWhitespace.java | 242 +++++++++++ .../tools/util/normalizer/CharClassTest.java | 292 +++++++++++++ .../util/normalizer/CodePointSetTest.java | 241 +++++++++++ .../util/normalizer/UnicodeDashTest.java | 170 ++++++++ .../normalizer/UnicodeWhitespaceTest.java | 239 +++++++++++ .../src/main/java/opennlp/dl/AbstractDL.java | 33 ++ .../dl/doccat/DocumentCategorizerDL.java | 13 +- .../opennlp/dl/namefinder/NameFinderDL.java | 90 ++-- .../opennlp/dl/AbstractDLChunkingTest.java | 61 +++ .../dl/namefinder/NameFinderDLTest.java | 33 +- .../AccentFoldCharSequenceNormalizer.java | 133 ++++++ .../CaseFoldCharSequenceNormalizer.java | 47 +++ .../DashCharSequenceNormalizer.java | 45 ++ .../normalizer/NfcCharSequenceNormalizer.java | 45 ++ .../NfkcCharSequenceNormalizer.java | 46 +++ .../WhitespaceCharSequenceNormalizer.java | 46 +++ .../AccentFoldCharSequenceNormalizerTest.java | 115 ++++++ .../UnicodeCharSequenceNormalizerTest.java | 97 +++++ 24 files 
changed, 2960 insertions(+), 37 deletions(-) create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/CharClass.java create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/NormalizedText.java create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/OffsetMap.java create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java create mode 100644 opennlp-api/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java create mode 100644 opennlp-api/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java create mode 100644 opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java create mode 100644 opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java create mode 100644 opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DashCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfcCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfkcCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/WhitespaceCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java create 
mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeCharSequenceNormalizerTest.java diff --git a/opennlp-api/pom.xml b/opennlp-api/pom.xml index 05404d154..516d9baec 100644 --- a/opennlp-api/pom.xml +++ b/opennlp-api/pom.xml @@ -49,6 +49,12 @@ <artifactId>junit-jupiter-engine</artifactId> <scope>test</scope> </dependency> + + <dependency> + <groupId>org.junit.jupiter</groupId> + <artifactId>junit-jupiter-params</artifactId> + <scope>test</scope> + </dependency> \ No newline at end of file diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CharClass.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CharClass.java new file mode 100644 index 000000000..766f3324e --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CharClass.java @@ -0,0 +1,383 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.ArrayList; +import java.util.List; +import java.util.Objects; + +import opennlp.tools.util.Span; + +/** + * A configurable class of Unicode code points and the cursor based operations over it. + * + *

<p>A {@code CharClass} pairs a {@link CodePointSet} of member code points with a single + * canonical ASCII {@code replacement} code point. Whitespace and dashes are the two built-in + * presets ({@link #whitespace()}, {@link #dashes()}); any other class is one more configured + * instance with no new engine code.</p>

+ * + *

<p>Every operation is a single forward pass that reads one code point + * ({@link Character#codePointAt(CharSequence, int)}), tests membership in O(1), acts, and advances + * by {@link Character#charCount(int)}. There is no regular expression, no {@link java.util.regex} + * allocation, and no reliance on {@link Character#isWhitespace(int)} or + * {@link Character#isSpaceChar(int)}, all of which disagree with the Unicode standard.</p>

+ * + *

<p>Instances are immutable and thread-safe.</p>

+ */ +public final class CharClass { + + private static final CharClass WHITESPACE = + new CharClass(CodePointSet.of(UnicodeWhitespace.codePoints()), 0x0020); + private static final CharClass DASHES = + new CharClass(CodePointSet.of(UnicodeDash.defaultDashCodePoints()), UnicodeDash.HYPHEN_MINUS); + + private final CodePointSet members; + private final int replacement; + + private CharClass(CodePointSet members, int replacement) { + this.members = members; + this.replacement = replacement; + } + + /** + * Creates a class from a member set and a replacement code point. + * + * @param members The member code points. + * @param replacement The canonical code point used by {@link #normalize(CharSequence)} and + * {@link #collapse(CharSequence)}. + * @return The class. + * @throws IllegalArgumentException Thrown if {@code replacement} is not a valid code point. + */ + public static CharClass of(CodePointSet members, int replacement) { + Objects.requireNonNull(members, "members"); + requireValidCodePoint(replacement); + return new CharClass(members, replacement); + } + + /** {@return the whitespace preset: the Unicode {@code White_Space} set, replacement {@code U+0020}} */ + public static CharClass whitespace() { + return WHITESPACE; + } + + /** + * {@return the dash preset: the Unicode {@code Dash} set excluding the mathematical minus signs, + * replacement {@code U+002D}} + */ + public static CharClass dashes() { + return DASHES; + } + + /** + * Returns a copy of this class whose member set is extended with {@code extra} (for example, + * user-defined code points loaded from {@link CodePointSet#fromFile}). + * + * @param extra The additional member code points. + * @return A new {@code CharClass}; this instance is unchanged. 
+ */ + public CharClass withAdditional(CodePointSet extra) { + Objects.requireNonNull(extra, "extra"); + return new CharClass(members.union(extra), replacement); + } + + /** {@return the member code points of this class} */ + public CodePointSet members() { + return members; + } + + /** {@return the canonical replacement code point} */ + public int replacement() { + return replacement; + } + + /** + * Tests membership. + * + * @param codePoint The code point to test. + * @return {@code true} if the code point is a member of this class. + */ + public boolean contains(int codePoint) { + return members.contains(codePoint); + } + + /** + * Splits text into the maximal runs of non-member code points, as character spans into the + * original text. Runs of members are delimiters and produce no empty spans. + * + * @param text The text to split. + * @return The token spans, in order. + */ + public List<Span> splitSpans(CharSequence text) { + Objects.requireNonNull(text, "text"); + final List<Span> spans = new ArrayList<>(); + final int length = text.length(); + int tokenStart = -1; + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + if (members.contains(codePoint)) { + if (tokenStart >= 0) { + spans.add(new Span(tokenStart, i)); + tokenStart = -1; + } + } else if (tokenStart < 0) { + tokenStart = i; + } + i += Character.charCount(codePoint); + } + if (tokenStart >= 0) { + spans.add(new Span(tokenStart, length)); + } + return spans; + } + + /** + * Splits text into the maximal runs of non-member code points. + * + * @param text The text to split. + * @return The tokens, in order, with no empty entries.
+ */ + public String[] split(CharSequence text) { + final List<Span> spans = splitSpans(text); + final String[] tokens = new String[spans.size()]; + for (int i = 0; i < spans.size(); i++) { + final Span span = spans.get(i); + tokens[i] = text.subSequence(span.getStart(), span.getEnd()).toString(); + } + return tokens; + } + + /** + * Replaces each member code point with the replacement, one for one. + * + * @param text The text to normalize. + * @return The normalized text. + */ + public String normalize(CharSequence text) { + Objects.requireNonNull(text, "text"); + final StringBuilder out = new StringBuilder(text.length()); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + out.appendCodePoint(members.contains(codePoint) ? replacement : codePoint); + i += Character.charCount(codePoint); + } + return out.toString(); + } + + /** + * Collapses each maximal run of member code points to a single replacement. + * + * @param text The text to collapse. + * @return The collapsed text. + */ + public String collapse(CharSequence text) { + Objects.requireNonNull(text, "text"); + final StringBuilder out = new StringBuilder(text.length()); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + if (members.contains(codePoint)) { + out.appendCodePoint(replacement); + i = skipRun(text, i); + } else { + out.appendCodePoint(codePoint); + i += Character.charCount(codePoint); + } + } + return out.toString(); + } + + /** + * Collapses runs of members like {@link #collapse(CharSequence)}, but emits + * {@code keepReplacement} instead of the usual replacement for any run that contains a code + * point in {@code keep}. The whitespace "squish" that preserves a line break uses this with the + * line-break code points as {@code keep} and {@code '\n'} as {@code keepReplacement}. + * + * @param text The text to collapse.
+ * @param keep The member code points whose presence in a run preserves structure. + * @param keepReplacement The replacement emitted for a run that contains a {@code keep} member. + * @return The collapsed text. + * @throws IllegalArgumentException Thrown if {@code keepReplacement} is not a valid code point. + */ + public String collapsePreserving(CharSequence text, CodePointSet keep, int keepReplacement) { + Objects.requireNonNull(text, "text"); + Objects.requireNonNull(keep, "keep"); + requireValidCodePoint(keepReplacement); + final StringBuilder out = new StringBuilder(text.length()); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + if (members.contains(codePoint)) { + boolean preserve = keep.contains(codePoint); + int j = i + Character.charCount(codePoint); + while (j < length) { + final int next = Character.codePointAt(text, j); + if (!members.contains(next)) { + break; + } + preserve |= keep.contains(next); + j += Character.charCount(next); + } + out.appendCodePoint(preserve ? keepReplacement : replacement); + i = j; + } else { + out.appendCodePoint(codePoint); + i += Character.charCount(codePoint); + } + } + return out.toString(); + } + + /** + * Removes leading and trailing member code points. + * + * @param text The text to trim. + * @return The trimmed text. + */ + public String trim(CharSequence text) { + Objects.requireNonNull(text, "text"); + final int length = text.length(); + int start = 0; + while (start < length) { + final int codePoint = Character.codePointAt(text, start); + if (!members.contains(codePoint)) { + break; + } + start += Character.charCount(codePoint); + } + int end = length; + while (end > start) { + final int codePoint = Character.codePointBefore(text, end); + if (!members.contains(codePoint)) { + break; + } + end -= Character.charCount(codePoint); + } + return text.subSequence(start, end).toString(); + } + + /** + * Removes every member code point. 
+ * + * @param text The text to filter. + * @return The text with all members removed. + */ + public String removeAll(CharSequence text) { + Objects.requireNonNull(text, "text"); + final StringBuilder out = new StringBuilder(text.length()); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + if (!members.contains(codePoint)) { + out.appendCodePoint(codePoint); + } + i += Character.charCount(codePoint); + } + return out.toString(); + } + + /** + * Like {@link #normalize(CharSequence)} but also produces the {@link OffsetMap} back to the + * original text. + * + * @param text The text to normalize. + * @return The normalized text and its offset map. + */ + public NormalizedText normalizeMapped(CharSequence text) { + Objects.requireNonNull(text, "text"); + final StringBuilder out = new StringBuilder(text.length()); + final OffsetMap.Builder offsets = new OffsetMap.Builder(); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + if (members.contains(codePoint)) { + appendMapped(out, replacement, offsets, i, i); + } else { + appendMapped(out, codePoint, offsets, i, i + 1); + } + i += Character.charCount(codePoint); + } + return new NormalizedText(text, out.toString(), offsets.build(length)); + } + + /** + * Like {@link #collapse(CharSequence)} but also produces the {@link OffsetMap} back to the + * original text. Each collapsed run maps to the run's start offset. + * + * @param text The text to collapse. + * @return The collapsed text and its offset map. 
+ */ + public NormalizedText collapseMapped(CharSequence text) { + Objects.requireNonNull(text, "text"); + final StringBuilder out = new StringBuilder(text.length()); + final OffsetMap.Builder offsets = new OffsetMap.Builder(); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + if (members.contains(codePoint)) { + appendMapped(out, replacement, offsets, i, i); + i = skipRun(text, i); + } else { + appendMapped(out, codePoint, offsets, i, i + 1); + i += Character.charCount(codePoint); + } + } + return new NormalizedText(text, out.toString(), offsets.build(length)); + } + + // Appends one code point to the output and records an original offset for each output char. + // firstOffset maps the first (or only) char; secondOffset maps the low surrogate of a + // supplementary code point. + private static void appendMapped(StringBuilder out, int codePoint, OffsetMap.Builder offsets, + int firstOffset, int secondOffset) { + if (Character.isBmpCodePoint(codePoint)) { + out.append((char) codePoint); + offsets.map(firstOffset); + } else { + out.append(Character.highSurrogate(codePoint)); + offsets.map(firstOffset); + out.append(Character.lowSurrogate(codePoint)); + offsets.map(secondOffset); + } + } + + // Returns the offset just past the maximal run of members starting at runStart. 
+ private int skipRun(CharSequence text, int runStart) { + final int length = text.length(); + int i = runStart; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + if (!members.contains(codePoint)) { + break; + } + i += Character.charCount(codePoint); + } + return i; + } + + private static void requireValidCodePoint(int codePoint) { + if (codePoint < 0 || codePoint > Character.MAX_CODE_POINT) { + throw new IllegalArgumentException("Not a Unicode code point: " + codePoint); + } + } +} diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java new file mode 100644 index 000000000..a15b005b0 --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.io.IOException; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.BitSet; +import java.util.List; +import java.util.Locale; +import java.util.Objects; + +/** + * An immutable set of Unicode code points with O(1) membership. + * + *

<p>Backed by a {@link BitSet} keyed directly by code point, so {@link #contains(int)} is a + * single array-word read with no boxing, hashing, or branching beyond a range check. Memory is + * bounded by the largest member code point (the whole of Unicode would cost about {@code 136 KiB}, + * and the standard whitespace and dash sets are entirely or almost entirely in the Basic + * Multilingual Plane, so a few kilobytes in practice).</p>

+ * + *

<p>This type carries no opinion about what the code points mean. It is the explicit, + * standards-sourced data layer that {@link CharClass} and the reference tables + * ({@link UnicodeWhitespace}, {@link UnicodeDash}) are built from, and that users extend or + * override through {@link #fromFile(Path, String)}.</p>

+ */ +public final class CodePointSet { + + private final BitSet members; + + private CodePointSet(BitSet members) { + this.members = members; + } + + /** + * Creates a set from explicit code points. + * + * @param codePoints The code points to include. + * @return The set. + * @throws IllegalArgumentException Thrown if any value is not a valid Unicode code point + * (outside {@code [0, U+10FFFF]}). + */ + public static CodePointSet of(int... codePoints) { + final BitSet members = new BitSet(); + for (final int codePoint : codePoints) { + requireValid(codePoint); + members.set(codePoint); + } + return new CodePointSet(members); + } + + /** + * Creates a set covering an inclusive code point range. + * + * @param firstInclusive The first code point in the range. + * @param lastInclusive The last code point in the range. + * @return The set. + * @throws IllegalArgumentException Thrown if either bound is invalid or {@code firstInclusive} + * is greater than {@code lastInclusive}. + */ + public static CodePointSet ofRange(int firstInclusive, int lastInclusive) { + requireValid(firstInclusive); + requireValid(lastInclusive); + if (firstInclusive > lastInclusive) { + throw new IllegalArgumentException("Range start " + firstInclusive + + " must not exceed range end " + lastInclusive + "."); + } + final BitSet members = new BitSet(); + members.set(firstInclusive, lastInclusive + 1); + return new CodePointSet(members); + } + + /** + * Loads the code points declared under one section of a user definitions file. + * + *

<p>The format is line oriented and parsed with simple cursor scanning, not a regular + * expression: a {@code [name]} line opens a section; a {@code #} begins a comment that runs to + * end of line; each remaining line is a single hex code point ({@code U+00A0}, {@code 0x00A0}, + * or {@code 00A0}) or an inclusive range ({@code U+2000-U+200A}). Section names match case + * insensitively. Only entries under the requested section are returned, so one file can carry, + * for example, both {@code [whitespace]} and {@code [dash]} sections.</p>

+ * + * @param definitions The file to read (UTF-8). + * @param section The section whose entries should be loaded. + * @return The code points declared under {@code section}, or an empty set if the section is + * absent. + * @throws IOException Thrown if the file cannot be read. + * @throws IllegalArgumentException Thrown if a line is malformed, naming the offending line. + */ + public static CodePointSet fromFile(Path definitions, String section) throws IOException { + Objects.requireNonNull(definitions, "definitions"); + return parse(Files.readAllLines(definitions, StandardCharsets.UTF_8), section); + } + + // Package visible so the parser can be exercised directly, without a temporary file. + static CodePointSet parse(List<String> lines, String section) { + Objects.requireNonNull(section, "section"); + final String wanted = section.trim().toLowerCase(Locale.ROOT); + final BitSet members = new BitSet(); + String current = null; + + for (int i = 0; i < lines.size(); i++) { + final String raw = lines.get(i); + final int lineNumber = i + 1; + final String line = stripComment(raw).strip(); + if (line.isEmpty()) { + continue; + } + if (line.charAt(0) == '[') { + if (line.length() < 3 || line.charAt(line.length() - 1) != ']') { + throw malformed("section header", lineNumber, raw); + } + current = line.substring(1, line.length() - 1).strip().toLowerCase(Locale.ROOT); + continue; + } + if (current == null) { + throw new IllegalArgumentException("Code point entry before any [section] header on line " + + lineNumber + ": " + raw); + } + if (wanted.equals(current)) { + addEntry(members, line, lineNumber, raw); + } + } + + return new CodePointSet(members); + } + + private static void addEntry(BitSet members, String line, int lineNumber, String raw) { + final int separator = line.indexOf('-'); + if (separator < 0) { + members.set(parseCodePoint(line, lineNumber, raw)); + return; + } + final int low = parseCodePoint(line.substring(0, separator).strip(), lineNumber, raw); + final int
high = parseCodePoint(line.substring(separator + 1).strip(), lineNumber, raw); + if (low > high) { + throw new IllegalArgumentException("Descending code point range on line " + + lineNumber + ": " + raw); + } + members.set(low, high + 1); + } + + private static int parseCodePoint(String token, int lineNumber, String raw) { + String hex = token; + if (hex.length() >= 2) { + final String prefix = hex.substring(0, 2).toLowerCase(Locale.ROOT); + if (prefix.equals("u+") || prefix.equals("0x")) { + hex = hex.substring(2); + } + } + if (hex.isEmpty()) { + throw malformed("code point", lineNumber, raw); + } + final int codePoint; + try { + codePoint = Integer.parseInt(hex, 16); + } catch (NumberFormatException e) { + throw new IllegalArgumentException("Invalid hex code point '" + token + "' on line " + + lineNumber + ": " + raw, e); + } + if (codePoint < 0 || codePoint > Character.MAX_CODE_POINT) { + throw new IllegalArgumentException("Code point out of range on line " + + lineNumber + ": " + raw); + } + return codePoint; + } + + private static String stripComment(String raw) { + final int hash = raw.indexOf('#'); + return hash < 0 ? raw : raw.substring(0, hash); + } + + private static IllegalArgumentException malformed(String what, int lineNumber, String raw) { + return new IllegalArgumentException("Malformed " + what + " on line " + lineNumber + ": " + raw); + } + + private static void requireValid(int codePoint) { + if (codePoint < 0 || codePoint > Character.MAX_CODE_POINT) { + throw new IllegalArgumentException("Not a Unicode code point: " + codePoint); + } + } + + /** + * Tests membership. + * + * @param codePoint The code point to test. Out-of-range values return {@code false}. + * @return {@code true} if the code point is in this set. + */ + public boolean contains(int codePoint) { + return codePoint >= 0 && codePoint <= Character.MAX_CODE_POINT && members.get(codePoint); + } + + /** + * Returns a new set containing every code point in this set or {@code other}. 
+ * + * @param other The set to union with. + * @return The union, a new set; neither input is modified. + */ + public CodePointSet union(CodePointSet other) { + Objects.requireNonNull(other, "other"); + final BitSet merged = (BitSet) members.clone(); + merged.or(other.members); + return new CodePointSet(merged); + } + + /** {@return the number of code points in this set} */ + public int size() { + return members.cardinality(); + } + + /** {@return whether this set is empty} */ + public boolean isEmpty() { + return members.isEmpty(); + } + + /** {@return the member code points, in ascending order} */ + public int[] toArray() { + return members.stream().toArray(); + } + + @Override + public boolean equals(Object o) { + return o instanceof CodePointSet other && members.equals(other.members); + } + + @Override + public int hashCode() { + return members.hashCode(); + } +} diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/NormalizedText.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/NormalizedText.java new file mode 100644 index 000000000..87678d741 --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/NormalizedText.java @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * The result of a normalization that keeps the original text alongside the normalized form. + * + *

<p>The original is the source of truth (display, offsets, language-specific analysis); the + * normalized form is a derived view tuned for matching and search. The {@link OffsetMap} ties the + * two together so a position in the normalized text can be reported against the original.</p>

+ * + * @param original The untouched source text. + * @param normalized The normalized text. + * @param offsets The mapping between normalized and original character offsets. + */ +public record NormalizedText(CharSequence original, String normalized, OffsetMap offsets) { + + /** + * Maps a normalized character offset back to the original text. + * + * @param normalizedOffset An offset in {@code [0, normalized().length()]}. + * @return The corresponding original character offset. + */ + public int toOriginalOffset(int normalizedOffset) { + return offsets.toOriginalOffset(normalizedOffset); + } + + /** + * Maps an original character offset forward to the normalized text. + * + * @param originalOffset An offset in {@code [0, original().length()]}. + * @return The corresponding normalized character offset. + */ + public int toNormalizedOffset(int originalOffset) { + return offsets.toNormalizedOffset(originalOffset); + } +} diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/OffsetMap.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/OffsetMap.java new file mode 100644 index 000000000..24fa558cf --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/OffsetMap.java @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.Arrays; + +/** + * A mapping between character offsets in a normalized string and the original text it came from. + * + *

<p>Normalization that collapses runs or substitutes supplementary characters changes string + * length, so an offset into the normalized form no longer lines up with the original. This map + * records, for every normalized character, the original character offset it was produced from, + * which lets a match found in the normalized form be reported in original coordinates.</p>

+ * + *

<p>The internal mapping is non-decreasing, so {@link #toOriginalOffset(int)} is a direct array + * read (O(1)) and {@link #toNormalizedOffset(int)} is a binary search (O(log n)). The map is + * built in the same single cursor pass that produces the normalized text, via {@link Builder}.</p>

+ */ +public final class OffsetMap { + + // normalizedToOriginal[k] is the original char offset that produced normalized char k. + // It has one extra trailing slot mapping the end of the normalized text to the end of the + // original text, so offsets in [0, normalizedLength] are all valid. + private final int[] normalizedToOriginal; + private final int originalLength; + + private OffsetMap(int[] normalizedToOriginal, int originalLength) { + this.normalizedToOriginal = normalizedToOriginal; + this.originalLength = originalLength; + } + + /** + * Maps a normalized character offset back to the original text. + * + * @param normalizedOffset An offset in {@code [0, normalizedLength]}. + * @return The corresponding original character offset. + * @throws IndexOutOfBoundsException Thrown if {@code normalizedOffset} is out of range. + */ + public int toOriginalOffset(int normalizedOffset) { + if (normalizedOffset < 0 || normalizedOffset >= normalizedToOriginal.length) { + throw new IndexOutOfBoundsException("normalized offset " + normalizedOffset + + " is outside [0, " + normalizedLength() + "]"); + } + return normalizedToOriginal[normalizedOffset]; + } + + /** + * Maps an original character offset forward to the normalized text. + * + *

<p>Returns the first normalized offset whose source is at or after {@code originalOffset}. + * When several original characters collapse to one normalized character, they all map to that + * single normalized offset.</p>

+ * + * @param originalOffset An offset in {@code [0, originalLength]}. + * @return The corresponding normalized character offset. + * @throws IndexOutOfBoundsException Thrown if {@code originalOffset} is out of range. + */ + public int toNormalizedOffset(int originalOffset) { + if (originalOffset < 0 || originalOffset > originalLength) { + throw new IndexOutOfBoundsException("original offset " + originalOffset + + " is outside [0, " + originalLength + "]"); + } + int low = 0; + int high = normalizedToOriginal.length - 1; + int answer = normalizedToOriginal.length - 1; + while (low <= high) { + final int mid = (low + high) >>> 1; + if (normalizedToOriginal[mid] >= originalOffset) { + answer = mid; + high = mid - 1; + } else { + low = mid + 1; + } + } + return answer; + } + + /** {@return the length of the normalized text this map was built for} */ + public int normalizedLength() { + return normalizedToOriginal.length - 1; + } + + /** {@return the length of the original text this map was built for} */ + public int originalLength() { + return originalLength; + } + + /** + * Builds an {@link OffsetMap} incrementally during a normalization pass. Call {@link #map(int)} + * once for each character appended to the normalized output, then {@link #build(int)} once. + */ + public static final class Builder { + + private int[] buffer = new int[16]; + private int length; + + /** + * Records the original character offset that produced the next normalized character. + * + * @param originalOffset The source offset in the original text. + */ + public void map(int originalOffset) { + if (length == buffer.length) { + buffer = Arrays.copyOf(buffer, buffer.length * 2); + } + buffer[length++] = originalOffset; + } + + /** + * Finalizes the map. + * + * @param originalLength The length of the original text (used as the trailing sentinel). + * @return The immutable {@link OffsetMap}. 
+ */ + public OffsetMap build(int originalLength) { + final int[] mapping = Arrays.copyOf(buffer, length + 1); + mapping[length] = originalLength; + return new OffsetMap(mapping, originalLength); + } + } +} diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java new file mode 100644 index 000000000..7ac3ea829 --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java @@ -0,0 +1,189 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.ArrayList; +import java.util.BitSet; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Reference data for Unicode dashes, plus O(1) membership lookups. + * + *

<p>This is a static, immutable table of every code point that carries the Unicode {@code Dash} + * property (Unicode Character Database, {@code PropList.txt}). The set is broader than the + * {@code Pd} (dash punctuation) general category: it also includes the swung dash ({@code Po}) + * and the mathematical minus signs ({@code Sm}). Java offers no {@code Dash} predicate and + * {@code \p{Pd}} would miss the {@code Sm} and {@code Po} members, which is why the set is kept + * here explicitly.</p>

+ * + *

<p>Two distinctions matter for normalization:</p>

+ * <ul> + * <li>The three mathematical minus signs ({@code U+207B}, {@code U+208B}, {@code U+2212}, all + * category {@code Sm}) are excluded from {@link #defaultDashCodePoints()} because flattening + * them to {@code U+002D} can change mathematical meaning. They remain available through + * {@link #codePoints()} for callers that opt in.</li> + * <li>{@code U+00AD} SOFT HYPHEN is deliberately absent: it is a format character + * ({@code White_Space=no}, {@code Dash=no}), an invisible line-break hint, and must not be + * turned into a visible hyphen.</li> + * </ul>
+ */ +public final class UnicodeDash { + + /** The canonical ASCII dash that dashes are normalized to: {@code U+002D} HYPHEN-MINUS. */ + public static final int HYPHEN_MINUS = 0x002D; + + /** The Unicode general category of a dash code point. */ + public enum Category { + /** {@code Pd} - dash punctuation. */ + Pd, + /** {@code Po} - other punctuation (the swung dash). */ + Po, + /** {@code Sm} - math symbol (the minus signs). */ + Sm + } + + /** + * One Unicode dash code point and its reference attributes. + * + * @param codePoint The Unicode code point. + * @param name The Unicode character name, lower cased. + * @param category The Unicode general {@link Category category}. + */ + public record DashCharacter(int codePoint, String name, Category category) { + + /** {@return whether this is a mathematical minus sign (category {@code Sm})} */ + public boolean isMathematical() { + return category == Category.Sm; + } + + /** {@return whether this code point is outside the Basic Multilingual Plane} */ + public boolean isSupplementary() { + return codePoint > 0xFFFF; + } + + /** {@return the {@code U+XXXX} notation for this code point} */ + public String toUnicodeNotation() { + return String.format("U+%04X", codePoint); + } + } + + private static final List DASHES = List.of( + new DashCharacter(0x002D, "hyphen-minus", Category.Pd), + new DashCharacter(0x058A, "armenian hyphen", Category.Pd), + new DashCharacter(0x05BE, "hebrew punctuation maqaf", Category.Pd), + new DashCharacter(0x1400, "canadian syllabics hyphen", Category.Pd), + new DashCharacter(0x1806, "mongolian todo soft hyphen", Category.Pd), + new DashCharacter(0x2010, "hyphen", Category.Pd), + new DashCharacter(0x2011, "non-breaking hyphen", Category.Pd), + new DashCharacter(0x2012, "figure dash", Category.Pd), + new DashCharacter(0x2013, "en dash", Category.Pd), + new DashCharacter(0x2014, "em dash", Category.Pd), + new DashCharacter(0x2015, "horizontal bar", Category.Pd), + new DashCharacter(0x2053, "swung 
dash", Category.Po), + new DashCharacter(0x207B, "superscript minus", Category.Sm), + new DashCharacter(0x208B, "subscript minus", Category.Sm), + new DashCharacter(0x2212, "minus sign", Category.Sm), + new DashCharacter(0x2E17, "double oblique hyphen", Category.Pd), + new DashCharacter(0x2E1A, "hyphen with diaeresis", Category.Pd), + new DashCharacter(0x2E3A, "two-em dash", Category.Pd), + new DashCharacter(0x2E3B, "three-em dash", Category.Pd), + new DashCharacter(0x2E40, "double hyphen", Category.Pd), + new DashCharacter(0x2E5D, "oblique hyphen", Category.Pd), + new DashCharacter(0x301C, "wave dash", Category.Pd), + new DashCharacter(0x3030, "wavy dash", Category.Pd), + new DashCharacter(0x30A0, "katakana-hiragana double hyphen", Category.Pd), + new DashCharacter(0xFE31, "presentation form for vertical em dash", Category.Pd), + new DashCharacter(0xFE32, "presentation form for vertical en dash", Category.Pd), + new DashCharacter(0xFE58, "small em dash", Category.Pd), + new DashCharacter(0xFE63, "small hyphen-minus", Category.Pd), + new DashCharacter(0xFF0D, "fullwidth hyphen-minus", Category.Pd), + new DashCharacter(0x10D6E, "garay hyphen", Category.Pd), + new DashCharacter(0x10EAD, "yezidi hyphenation mark", Category.Pd)); + + private static final Map BY_CODE_POINT = new HashMap<>(); + private static final BitSet MEMBERSHIP = new BitSet(); + private static final int[] CODE_POINTS = new int[DASHES.size()]; + private static final List MATHEMATICAL = new ArrayList<>(); + private static final int[] DEFAULT_CODE_POINTS; + + static { + final List defaults = new ArrayList<>(); + for (int i = 0; i < DASHES.size(); i++) { + final DashCharacter dash = DASHES.get(i); + BY_CODE_POINT.put(dash.codePoint(), dash); + MEMBERSHIP.set(dash.codePoint()); + CODE_POINTS[i] = dash.codePoint(); + if (dash.isMathematical()) { + MATHEMATICAL.add(dash); + } else { + defaults.add(dash.codePoint()); + } + } + DEFAULT_CODE_POINTS = defaults.stream().mapToInt(Integer::intValue).toArray(); + 
+ } + + private UnicodeDash() { + } + + /** + * Tests whether a code point carries the Unicode {@code Dash} property. + * + * @param codePoint The code point to test. Out-of-range values return {@code false}. + * @return {@code true} if the code point is one of the Unicode dash characters. + */ + public static boolean isDash(int codePoint) { + return codePoint >= 0 && codePoint <= Character.MAX_CODE_POINT && MEMBERSHIP.get(codePoint); + } + + /** + * Looks up the reference entry for a dash code point. + * + * @param codePoint The code point. + * @return The {@link DashCharacter}, or {@link Optional#empty()} if it is not a dash. + */ + public static Optional<DashCharacter> byCodePoint(int codePoint) { + return Optional.ofNullable(BY_CODE_POINT.get(codePoint)); + } + + /** {@return all Unicode dash characters, in ascending code point order} */ + public static List<DashCharacter> all() { + return DASHES; + } + + /** {@return the mathematical minus signs, excluded from the default normalization set} */ + public static List<DashCharacter> mathematical() { + return List.copyOf(MATHEMATICAL); + } + + /** {@return all dash code points, in ascending order, including the mathematical minus signs} */ + public static int[] codePoints() { + return CODE_POINTS.clone(); + } + + /** + * {@return the dash code points used for normalization by default, in ascending order} + * + *

<p>This is every dash except the mathematical minus signs, so flattening to + * {@link #HYPHEN_MINUS} does not silently rewrite mathematics.</p>

+ */ + public static int[] defaultDashCodePoints() { + return DEFAULT_CODE_POINTS.clone(); + } +} diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java new file mode 100644 index 000000000..3712f0906 --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java @@ -0,0 +1,242 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.ArrayList; +import java.util.BitSet; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Reference data for Unicode whitespace, plus O(1) membership lookups. + * + *

<p>This is a static, immutable table of the {@code 25} code points that carry the Unicode + * {@code White_Space} property, and the related {@code 6} code points that are commonly mistaken + * for whitespace but carry {@code White_Space=no} (zero-width and other format characters). + * The data mirrors the tables in + * Whitespace character + * and the Unicode Character Database ({@code PropList.txt}).</p>

+ * + *

<p>The membership test is deliberately built from this explicit table rather than from + * {@link Character#isWhitespace(int)} or {@link Character#isSpaceChar(int)}, both of which + * disagree with the Unicode {@code White_Space} property. {@code Character.isWhitespace} + * excludes the non-breaking spaces and {@code NEL} but includes the information-separator + * controls {@code U+001C}-{@code U+001F}; {@code Character.isSpaceChar} excludes tab, newline, + * and the other control-character line breaks. {@link #isWhitespace(int)} matches the standard + * exactly.</p>

+ */ +public final class UnicodeWhitespace { + + /** Unicode general category for a whitespace or related code point. */ + public enum Category { + /** {@code Cc} - control. */ + Cc, + /** {@code Zs} - space separator. */ + Zs, + /** {@code Zl} - line separator. */ + Zl, + /** {@code Zp} - paragraph separator. */ + Zp, + /** {@code Cf} - format (the related, non-whitespace code points). */ + Cf + } + + /** Line-breaking behavior, mirroring the "Notes" column of the reference table. */ + public enum Breaking { + /** A break opportunity, but not a forced line break (e.g. {@code SPACE}). */ + MAY_BREAK, + /** A forced line or paragraph break (e.g. {@code LF}, {@code LINE SEPARATOR}). */ + LINE_BREAK, + /** A space that suppresses line breaking (e.g. {@code NO-BREAK SPACE}). */ + NON_BREAKING + } + + /** + * One Unicode whitespace code point and its reference attributes. + * + * @param codePoint The Unicode code point. + * @param name The Unicode character name, lower cased as in the reference table. + * @param abbreviation The common abbreviation (for example {@code NBSP}), or {@code ""} if none. + * @param category The Unicode general {@link Category category}. + * @param breaking The line-{@link Breaking breaking} behavior. + */ + public record WhitespaceCharacter(int codePoint, String name, String abbreviation, + Category category, Breaking breaking) { + + /** {@return whether this code point forces a line or paragraph break} */ + public boolean isLineBreak() { + return breaking == Breaking.LINE_BREAK; + } + + /** {@return whether this is a non-breaking space} */ + public boolean isNonBreaking() { + return breaking == Breaking.NON_BREAKING; + } + + /** {@return the {@code U+XXXX} notation for this code point} */ + public String toUnicodeNotation() { + return String.format("U+%04X", codePoint); + } + } + + /** + * One related code point that is commonly confused with whitespace but is not + * ({@code White_Space=no}). 
These are format characters and must not be treated as, or + * normalized like, whitespace. + * + * @param codePoint The Unicode code point. + * @param name The Unicode character name, lower cased as in the reference table. + * @param abbreviation The common abbreviation (for example {@code BOM}), or {@code ""} if none. + * @param note A short description of what the character actually does. + */ + public record RelatedCharacter(int codePoint, String name, String abbreviation, String note) { + + /** {@return the {@code U+XXXX} notation for this code point} */ + public String toUnicodeNotation() { + return String.format("U+%04X", codePoint); + } + } + + private static final List WHITESPACE = List.of( + new WhitespaceCharacter(0x0009, "character tabulation", "HT", Category.Cc, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x000A, "line feed", "LF", Category.Cc, Breaking.LINE_BREAK), + new WhitespaceCharacter(0x000B, "line tabulation", "VT", Category.Cc, Breaking.LINE_BREAK), + new WhitespaceCharacter(0x000C, "form feed", "FF", Category.Cc, Breaking.LINE_BREAK), + new WhitespaceCharacter(0x000D, "carriage return", "CR", Category.Cc, Breaking.LINE_BREAK), + new WhitespaceCharacter(0x0020, "space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x0085, "next line", "NEL", Category.Cc, Breaking.LINE_BREAK), + new WhitespaceCharacter(0x00A0, "no-break space", "NBSP", Category.Zs, Breaking.NON_BREAKING), + new WhitespaceCharacter(0x1680, "ogham space mark", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2000, "en quad", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2001, "em quad", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2002, "en space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2003, "em space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2004, "three-per-em space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2005, 
"four-per-em space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2006, "six-per-em space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2007, "figure space", "", Category.Zs, Breaking.NON_BREAKING), + new WhitespaceCharacter(0x2008, "punctuation space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2009, "thin space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x200A, "hair space", "", Category.Zs, Breaking.MAY_BREAK), + new WhitespaceCharacter(0x2028, "line separator", "", Category.Zl, Breaking.LINE_BREAK), + new WhitespaceCharacter(0x2029, "paragraph separator", "", Category.Zp, Breaking.LINE_BREAK), + new WhitespaceCharacter(0x202F, "narrow no-break space", "NNBSP", Category.Zs, + Breaking.NON_BREAKING), + new WhitespaceCharacter(0x205F, "medium mathematical space", "MMSP", Category.Zs, + Breaking.MAY_BREAK), + new WhitespaceCharacter(0x3000, "ideographic space", "", Category.Zs, Breaking.MAY_BREAK)); + + private static final List LOOKALIKES = List.of( + new RelatedCharacter(0x180E, "mongolian vowel separator", "MVS", + "format character; narrow space for Mongolian"), + new RelatedCharacter(0x200B, "zero width space", "ZWSP", + "format; word boundary indicator, no visible width"), + new RelatedCharacter(0x200C, "zero width non-joiner", "ZWNJ", + "format; prevents character connection"), + new RelatedCharacter(0x200D, "zero width joiner", "ZWJ", + "format; enables character connection"), + new RelatedCharacter(0x2060, "word joiner", "WJ", + "format; non-breaking, no line break point"), + new RelatedCharacter(0xFEFF, "zero width no-break space", "BOM", + "format; byte order mark")); + + private static final Map BY_CODE_POINT = new HashMap<>(); + private static final BitSet MEMBERSHIP = new BitSet(); + private static final BitSet LOOKALIKE_MEMBERSHIP = new BitSet(); + private static final int[] CODE_POINTS = new int[WHITESPACE.size()]; + private static final List LINE_BREAKS = new 
ArrayList<>(); + private static final List NON_BREAKING = new ArrayList<>(); + + static { + for (int i = 0; i < WHITESPACE.size(); i++) { + final WhitespaceCharacter ws = WHITESPACE.get(i); + BY_CODE_POINT.put(ws.codePoint(), ws); + MEMBERSHIP.set(ws.codePoint()); + CODE_POINTS[i] = ws.codePoint(); + if (ws.isLineBreak()) { + LINE_BREAKS.add(ws); + } + if (ws.isNonBreaking()) { + NON_BREAKING.add(ws); + } + } + for (final RelatedCharacter related : LOOKALIKES) { + LOOKALIKE_MEMBERSHIP.set(related.codePoint()); + } + } + + private UnicodeWhitespace() { + } + + /** + * Tests whether a code point carries the Unicode {@code White_Space} property. + * + * @param codePoint The code point to test. Out-of-range values (negative or beyond + * {@link Character#MAX_CODE_POINT}) simply return {@code false}. + * @return {@code true} if the code point is one of the {@code 25} Unicode whitespace characters. + */ + public static boolean isWhitespace(int codePoint) { + return codePoint >= 0 && codePoint <= Character.MAX_CODE_POINT && MEMBERSHIP.get(codePoint); + } + + /** + * Tests whether a code point is one of the related, non-whitespace look-alike format characters. + * + * @param codePoint The code point to test. + * @return {@code true} if the code point is in the {@link #lookalikes() look-alike} set. + */ + public static boolean isLookalike(int codePoint) { + return codePoint >= 0 && codePoint <= Character.MAX_CODE_POINT + && LOOKALIKE_MEMBERSHIP.get(codePoint); + } + + /** + * Looks up the reference entry for a whitespace code point. + * + * @param codePoint The code point. + * @return The {@link WhitespaceCharacter}, or {@link Optional#empty()} if it is not whitespace. 
+ */ + public static Optional<WhitespaceCharacter> byCodePoint(int codePoint) { + return Optional.ofNullable(BY_CODE_POINT.get(codePoint)); + } + + /** {@return the {@code 25} Unicode whitespace characters, in ascending code point order} */ + public static List<WhitespaceCharacter> all() { + return WHITESPACE; + } + + /** {@return the related, non-whitespace look-alike format characters} */ + public static List<RelatedCharacter> lookalikes() { + return LOOKALIKES; + } + + /** {@return the whitespace characters that force a line or paragraph break} */ + public static List<WhitespaceCharacter> lineBreaks() { + return List.copyOf(LINE_BREAKS); + } + + /** {@return the non-breaking whitespace characters} */ + public static List<WhitespaceCharacter> nonBreaking() { + return List.copyOf(NON_BREAKING); + } + + /** {@return the whitespace code points, in ascending order} */ + public static int[] codePoints() { + return CODE_POINTS.clone(); + } +} diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java new file mode 100644 index 000000000..5e2a42ba6 --- /dev/null +++ b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java @@ -0,0 +1,292 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.util.List; + +import org.junit.jupiter.api.Test; + +import opennlp.tools.util.Span; +import opennlp.tools.util.normalizer.UnicodeWhitespace.WhitespaceCharacter; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class CharClassTest { + + private static final CharClass WS = CharClass.whitespace(); + private static final CharClass DASH = CharClass.dashes(); + + // Non-ASCII test characters are built from code points (no literal glyphs, no Unicode escapes) + // so the source stays pure ASCII and the intent is explicit. Tab and newline use \t and \n. + private static final String NBSP = cp(0x00A0); + private static final String IDEOGRAPHIC = cp(0x3000); + private static final String EM_DASH = cp(0x2014); + private static final String EN_DASH = cp(0x2013); + private static final String FIGURE_DASH = cp(0x2012); + private static final String MINUS_SIGN = cp(0x2212); + private static final String YEZIDI_HYPHEN = cp(0x10EAD); + private static final String GRINNING_FACE = cp(0x1F600); + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + private static CodePointSet lineBreaks() { + return CodePointSet.of(UnicodeWhitespace.lineBreaks().stream() + .mapToInt(WhitespaceCharacter::codePoint).toArray()); + } + + // --- membership -------------------------------------------------------------------------- + + @Test + void testWhitespacePresetMembership() { + assertTrue(WS.contains(0x0020)); + assertTrue(WS.contains(0x0009)); + assertTrue(WS.contains(0x00A0)); + assertTrue(WS.contains(0x3000)); + assertTrue(WS.contains(0x2028)); + assertFalse(WS.contains('a')); + assertFalse(WS.contains(0x200B), "zero 
width space is not whitespace"); + } + + @Test + void testDashPresetMembershipExcludesMathMinus() { + assertTrue(DASH.contains(0x2014)); + assertTrue(DASH.contains(0x2013)); + assertTrue(DASH.contains(0xFF0D)); + assertFalse(DASH.contains(0x2212), "math minus is excluded by default"); + assertFalse(DASH.contains('a')); + } + + // --- normalize / collapse ---------------------------------------------------------------- + + @Test + void testNormalizeReplacesEachMemberOneForOne() { + assertEquals("a b", WS.normalize("a" + NBSP + IDEOGRAPHIC + "b")); + assertEquals("well-known", DASH.normalize("well" + EM_DASH + "known")); + assertEquals("a-b-c", DASH.normalize("a" + EN_DASH + "b" + FIGURE_DASH + "c")); + } + + @Test + void testNormalizeLeavesMathMinusUntouched() { + assertEquals("5" + MINUS_SIGN + "3", DASH.normalize("5" + MINUS_SIGN + "3")); + } + + @Test + void testCollapseMergesRuns() { + assertEquals("a b", WS.collapse("a" + NBSP + IDEOGRAPHIC + "b")); + assertEquals(" a b ", WS.collapse(" a\t\tb ")); + assertEquals("a-b", DASH.collapse("a" + EM_DASH + EN_DASH + EM_DASH + "b")); + } + + @Test + void testNormalizeAndCollapseHandleSupplementaryMembers() { + assertEquals("x-y", DASH.normalize("x" + YEZIDI_HYPHEN + "y")); + assertEquals("x-y", DASH.collapse("x" + YEZIDI_HYPHEN + YEZIDI_HYPHEN + "y")); + } + + @Test + void testEmptyAndAllMemberInputs() { + assertEquals("", WS.normalize("")); + assertEquals("", WS.collapse("")); + assertEquals("", WS.trim("")); + assertEquals("", WS.removeAll("")); + assertArrayEquals(new String[0], WS.split("")); + assertArrayEquals(new String[0], WS.split(" " + IDEOGRAPHIC)); + } + + // --- squish (collapsePreserving) --------------------------------------------------------- + + @Test + void testCollapsePreservingKeepsLineBreaks() { + final CodePointSet keep = lineBreaks(); + assertEquals("a\nb", WS.collapsePreserving("a\n\n\t\tb", keep, '\n')); + assertEquals("a b", WS.collapsePreserving("a \t b", keep, '\n')); + 
assertEquals("a\nb\nc", WS.collapsePreserving("a\n \tb \nc", keep, '\n')); + } + + // --- trim / removeAll -------------------------------------------------------------------- + + @Test + void testTrim() { + assertEquals("hello", WS.trim("\t hello" + IDEOGRAPHIC + IDEOGRAPHIC)); + assertEquals("noedge", WS.trim("noedge")); + assertEquals("", WS.trim(" ")); + assertEquals("a b", WS.trim(" a b "), "interior whitespace is preserved"); + } + + @Test + void testRemoveAll() { + assertEquals("abcd", WS.removeAll("a b\tc d")); + } + + // --- split / splitSpans ------------------------------------------------------------------ + + @Test + void testSplitOnUnicodeWhitespace() { + assertArrayEquals(new String[] {"one", "two", "three"}, + WS.split("one two" + IDEOGRAPHIC + IDEOGRAPHIC + "three")); + assertArrayEquals(new String[] {"a", "b"}, WS.split(" a b ")); + } + + @Test + void testSplitSpansCarryOriginalOffsets() { + final String text = "one two"; + final List spans = WS.splitSpans(text); + assertEquals(2, spans.size()); + assertEquals(0, spans.get(0).getStart()); + assertEquals(3, spans.get(0).getEnd()); + assertEquals("one", spans.get(0).getCoveredText(text).toString()); + assertEquals(4, spans.get(1).getStart()); + assertEquals(7, spans.get(1).getEnd()); + assertEquals("two", spans.get(1).getCoveredText(text).toString()); + } + + @Test + void testSplitSpansWithSupplementaryToken() { + final String text = "a " + GRINNING_FACE + " b"; + final List spans = WS.splitSpans(text); + assertEquals(3, spans.size()); + assertEquals("a", spans.get(0).getCoveredText(text).toString()); + assertEquals(GRINNING_FACE, spans.get(1).getCoveredText(text).toString()); + assertEquals("b", spans.get(2).getCoveredText(text).toString()); + } + + // --- custom classes ---------------------------------------------------------------------- + + @Test + void testCustomClass() { + final CharClass vowelO = CharClass.of(CodePointSet.of('o'), '0'); + assertEquals("f00 bar", vowelO.normalize("foo bar")); 
+ assertEquals("f0", vowelO.collapse("foo")); + } + + @Test + void testWithAdditionalExtendsWithoutMutatingOriginal() { + final CharClass extended = WS.withAdditional(CodePointSet.of('_')); + assertTrue(extended.contains('_')); + assertTrue(extended.contains(0x0020)); + assertEquals("a b c", extended.normalize("a_b c")); + assertFalse(WS.contains('_'), "the preset must be unchanged"); + } + + @Test + void testOfRejectsInvalidReplacement() { + assertThrows(IllegalArgumentException.class, + () -> CharClass.of(CodePointSet.of(0x20), -1)); + assertThrows(IllegalArgumentException.class, + () -> CharClass.of(CodePointSet.of(0x20), Character.MAX_CODE_POINT + 1)); + } + + // --- offset-mapped variants -------------------------------------------------------------- + + @Test + void testCollapseMappedOffsets() { + final NormalizedText nt = WS.collapseMapped("a b"); + assertEquals("a b", nt.normalized()); + assertEquals(3, nt.offsets().normalizedLength()); + assertEquals(4, nt.offsets().originalLength()); + + assertEquals(0, nt.toOriginalOffset(0)); + assertEquals(1, nt.toOriginalOffset(1)); + assertEquals(3, nt.toOriginalOffset(2)); + assertEquals(4, nt.toOriginalOffset(3)); + + assertEquals(0, nt.toNormalizedOffset(0)); + assertEquals(1, nt.toNormalizedOffset(1)); + assertEquals(2, nt.toNormalizedOffset(3)); + assertEquals(3, nt.toNormalizedOffset(4)); + } + + @Test + void testNormalizeMappedIsIdentityWhenNothingMatches() { + final NormalizedText nt = WS.normalizeMapped("abc"); + assertEquals("abc", nt.normalized()); + for (int i = 0; i <= 3; i++) { + assertEquals(i, nt.toOriginalOffset(i)); + } + } + + @Test + void testNormalizeMappedPreservesSupplementaryCopyOffsets() { + final String text = "a" + GRINNING_FACE + "b"; + final NormalizedText nt = WS.normalizeMapped(text); + assertEquals(text, nt.normalized()); + for (int i = 0; i <= text.length(); i++) { + assertEquals(i, nt.toOriginalOffset(i)); + } + } + + @Test + void 
testNormalizeMappedCollapsesSupplementaryMemberToOneChar() { + final String text = "x" + YEZIDI_HYPHEN + "y"; + final NormalizedText nt = DASH.normalizeMapped(text); + assertEquals("x-y", nt.normalized()); + assertEquals(0, nt.toOriginalOffset(0)); + assertEquals(1, nt.toOriginalOffset(1)); + assertEquals(3, nt.toOriginalOffset(2)); + assertEquals(4, nt.toOriginalOffset(3)); + } + + @Test + void testOffsetMapRejectsOutOfRange() { + final OffsetMap map = WS.collapseMapped("ab").offsets(); + assertThrows(IndexOutOfBoundsException.class, () -> map.toOriginalOffset(-1)); + assertThrows(IndexOutOfBoundsException.class, + () -> map.toOriginalOffset(map.normalizedLength() + 1)); + assertThrows(IndexOutOfBoundsException.class, () -> map.toNormalizedOffset(-1)); + assertThrows(IndexOutOfBoundsException.class, + () -> map.toNormalizedOffset(map.originalLength() + 1)); + } + + @Test + void testAccessorsExposeMembersAndReplacement() { + assertEquals(0x0020, WS.replacement()); + assertEquals('-', DASH.replacement()); + assertTrue(WS.members().contains(0x00A0)); + assertFalse(WS.members().contains('a')); + } + + @Test + void testOffsetMapBuilderGrowsBeyondInitialCapacity() { + // 26 output characters force the OffsetMap builder past its initial 16-entry buffer. + final String text = "abcdefghijklmnopqrstuvwxyz"; + final NormalizedText nt = WS.normalizeMapped(text); + assertEquals(text, nt.normalized()); + assertEquals(26, nt.offsets().normalizedLength()); + for (int i = 0; i <= text.length(); i++) { + assertEquals(i, nt.toOriginalOffset(i)); + } + } + + @Test + void testNormalizeMappedWithSupplementaryReplacement() { + // A supplementary replacement exercises the two-char substitution path of the offset map. 
+ final int penguin = 0x1F427; + final CharClass toPenguin = CharClass.of(CodePointSet.of(' '), penguin); + final NormalizedText nt = toPenguin.normalizeMapped("a b"); + assertEquals("a" + new String(Character.toChars(penguin)) + "b", nt.normalized()); + assertEquals(0, nt.toOriginalOffset(0)); + assertEquals(1, nt.toOriginalOffset(1)); + assertEquals(1, nt.toOriginalOffset(2)); + assertEquals(2, nt.toOriginalOffset(3)); + } +} diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java new file mode 100644 index 000000000..769cea71f --- /dev/null +++ b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java @@ -0,0 +1,241 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.io.IOException; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.List; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.io.TempDir; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.ValueSource; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class CodePointSetTest { + + @Test + void testOfContainsExactlyTheGivenCodePoints() { + final CodePointSet set = CodePointSet.of(0x0041, 0x00A0, 0x1F600); + assertTrue(set.contains(0x0041)); + assertTrue(set.contains(0x00A0)); + assertTrue(set.contains(0x1F600)); + assertFalse(set.contains(0x0042)); + assertEquals(3, set.size()); + assertFalse(set.isEmpty()); + } + + @Test + void testToArrayIsAscending() { + final CodePointSet set = CodePointSet.of(0x3000, 0x0009, 0x00A0); + assertArrayEquals(new int[] {0x0009, 0x00A0, 0x3000}, set.toArray()); + } + + @Test + void testOfRangeIsInclusive() { + final CodePointSet set = CodePointSet.ofRange(0x2000, 0x200A); + assertTrue(set.contains(0x2000)); + assertTrue(set.contains(0x2005)); + assertTrue(set.contains(0x200A)); + assertFalse(set.contains(0x1FFF)); + assertFalse(set.contains(0x200B)); + assertEquals(11, set.size()); + } + + @Test + void testOfRangeRejectsDescending() { + assertThrows(IllegalArgumentException.class, () -> CodePointSet.ofRange(0x200A, 0x2000)); + } + + @ParameterizedTest + @ValueSource(ints = {-1, Integer.MIN_VALUE, Character.MAX_CODE_POINT + 1, Integer.MAX_VALUE}) + void testOfRejectsInvalidCodePoints(int codePoint) { + assertThrows(IllegalArgumentException.class, () -> CodePointSet.of(codePoint)); + } + 
+ @ParameterizedTest + @ValueSource(ints = {-1, Integer.MIN_VALUE, Character.MAX_CODE_POINT + 1, Integer.MAX_VALUE}) + void testContainsIsRangeSafe(int codePoint) { + assertFalse(CodePointSet.of(0x0020).contains(codePoint)); + } + + @Test + void testUnionIsNonDestructive() { + final CodePointSet a = CodePointSet.of(0x0041); + final CodePointSet b = CodePointSet.of(0x0042); + final CodePointSet union = a.union(b); + + assertTrue(union.contains(0x0041)); + assertTrue(union.contains(0x0042)); + assertEquals(2, union.size()); + assertFalse(a.contains(0x0042), "left operand must be unchanged"); + assertFalse(b.contains(0x0041), "right operand must be unchanged"); + } + + @Test + void testEqualsAndHashCode() { + assertEquals(CodePointSet.of(0x01, 0x02), CodePointSet.of(0x02, 0x01)); + assertEquals(CodePointSet.of(0x01, 0x02).hashCode(), CodePointSet.of(0x02, 0x01).hashCode()); + assertFalse(CodePointSet.of(0x01).equals(CodePointSet.of(0x02))); + } + + @Test + void testEqualsAgainstOtherTypesAndNull() { + final CodePointSet set = CodePointSet.of(0x20); + assertFalse(set.equals(null)); + assertFalse(set.equals("not a code point set")); + } + + @Test + void testParseAcceptsSingleHexDigit() { + assertTrue(CodePointSet.parse(List.of("[s]", "9"), "s").contains(0x9)); + } + + @Test + void testParseRejectsEmptyCodePointAfterPrefix() { + assertThrows(IllegalArgumentException.class, + () -> CodePointSet.parse(List.of("[s]", "U+"), "s")); + } + + @Test + void testParseRejectsTooShortSectionHeader() { + assertThrows(IllegalArgumentException.class, + () -> CodePointSet.parse(List.of("[]", "U+0041"), "s")); + } + + @Test + void testParseSingleCodePointsRangesCommentsAndBlankLines() { + final List lines = List.of( + "# a whitespace overlay", + "[whitespace]", + "U+00A0 # no-break space", + "0x2028", + "2029", + "", + "U+2000-U+200A # typographic spaces"); + + final CodePointSet set = CodePointSet.parse(lines, "whitespace"); + + assertTrue(set.contains(0x00A0)); + 
assertTrue(set.contains(0x2028)); + assertTrue(set.contains(0x2029)); + assertTrue(set.contains(0x2000)); + assertTrue(set.contains(0x2007)); + assertTrue(set.contains(0x200A)); + assertFalse(set.contains(0x200B)); + assertEquals(3 + 11, set.size()); + } + + @Test + void testParseReturnsOnlyRequestedSection() { + final List lines = List.of( + "[whitespace]", + "U+00A0", + "[dash]", + "U+2212", + "U+2014"); + + final CodePointSet whitespace = CodePointSet.parse(lines, "whitespace"); + assertTrue(whitespace.contains(0x00A0)); + assertFalse(whitespace.contains(0x2212)); + assertFalse(whitespace.contains(0x2014)); + + final CodePointSet dash = CodePointSet.parse(lines, "dash"); + assertTrue(dash.contains(0x2212)); + assertTrue(dash.contains(0x2014)); + assertFalse(dash.contains(0x00A0)); + } + + @Test + void testParseSectionNameIsCaseInsensitive() { + final List lines = List.of("[WhiteSpace]", "U+00A0"); + assertTrue(CodePointSet.parse(lines, "whitespace").contains(0x00A0)); + assertTrue(CodePointSet.parse(lines, "WHITESPACE").contains(0x00A0)); + } + + @Test + void testParseMissingSectionIsEmpty() { + final List lines = List.of("[whitespace]", "U+00A0"); + assertTrue(CodePointSet.parse(lines, "dash").isEmpty()); + } + + @Test + void testParseRejectsMalformedSectionHeader() { + final List lines = List.of("[whitespace", "U+00A0"); + final IllegalArgumentException e = assertThrows(IllegalArgumentException.class, + () -> CodePointSet.parse(lines, "whitespace")); + assertTrue(e.getMessage().contains("line 1"), e.getMessage()); + } + + @Test + void testParseRejectsInvalidHex() { + final List lines = List.of("[whitespace]", "U+ZZZZ"); + final IllegalArgumentException e = assertThrows(IllegalArgumentException.class, + () -> CodePointSet.parse(lines, "whitespace")); + assertTrue(e.getMessage().contains("line 2"), e.getMessage()); + } + + @Test + void testParseRejectsDescendingRange() { + final List lines = List.of("[whitespace]", "U+200A-U+2000"); + 
assertThrows(IllegalArgumentException.class, () -> CodePointSet.parse(lines, "whitespace")); + } + + @Test + void testParseRejectsOutOfRangeCodePoint() { + final List lines = List.of("[whitespace]", "U+110000"); + assertThrows(IllegalArgumentException.class, () -> CodePointSet.parse(lines, "whitespace")); + } + + @Test + void testParseRejectsEntryBeforeAnySection() { + final List lines = List.of("U+00A0"); + final IllegalArgumentException e = assertThrows(IllegalArgumentException.class, + () -> CodePointSet.parse(lines, "whitespace")); + assertTrue(e.getMessage().contains("before any [section]"), e.getMessage()); + } + + @Test + void testParseAcceptsAllThreeHexPrefixes() { + final List lines = List.of("[s]", "U+0041", "0x0042", "0043"); + final CodePointSet set = CodePointSet.parse(lines, "s"); + assertTrue(set.contains(0x41)); + assertTrue(set.contains(0x42)); + assertTrue(set.contains(0x43)); + } + + @Test + void testFromFileReadsTheNamedSection(@TempDir Path dir) throws IOException { + final Path file = dir.resolve("delimiters.txt"); + Files.writeString(file, String.join("\n", + "[whitespace]", + "U+00A0", + "[dash]", + "U+2E5D"), StandardCharsets.UTF_8); + + assertTrue(CodePointSet.fromFile(file, "whitespace").contains(0x00A0)); + assertTrue(CodePointSet.fromFile(file, "dash").contains(0x2E5D)); + assertFalse(CodePointSet.fromFile(file, "dash").contains(0x00A0)); + } +} diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java new file mode 100644 index 000000000..9d547a980 --- /dev/null +++ b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.Arrays; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.MethodSource; +import org.junit.jupiter.params.provider.ValueSource; + +import opennlp.tools.util.normalizer.UnicodeDash.Category; +import opennlp.tools.util.normalizer.UnicodeDash.DashCharacter; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class UnicodeDashTest { + + private static List dashes() { + return UnicodeDash.all(); + } + + // Maps the running JDK's general category to our enum, or null if it cannot be expressed (which + // includes code points the JDK's Unicode version does not yet assign). 
+ private static Category jdkCategory(int codePoint) { + return switch (Character.getType(codePoint)) { + case Character.DASH_PUNCTUATION -> Category.Pd; + case Character.MATH_SYMBOL -> Category.Sm; + case Character.OTHER_PUNCTUATION -> Category.Po; + default -> null; + }; + } + + @Test + void testDashSetHasExactly31() { + assertEquals(31, UnicodeDash.all().size()); + } + + @ParameterizedTest + @MethodSource("dashes") + void testEachDashIsSelfConsistent(DashCharacter dash) { + assertTrue(UnicodeDash.isDash(dash.codePoint()), dash::toUnicodeNotation); + assertEquals(dash, UnicodeDash.byCodePoint(dash.codePoint()).orElseThrow()); + assertNotNull(dash.category()); + assertFalse(dash.name().isBlank()); + } + + @ParameterizedTest + @MethodSource("dashes") + void testCategoryMatchesJdkUnicodeDataWhenAssigned(DashCharacter dash) { + final Category jdk = jdkCategory(dash.codePoint()); + // Skip code points the running JVM's Unicode version does not assign yet (e.g. newer dashes). + if (Character.getType(dash.codePoint()) != Character.UNASSIGNED) { + assertEquals(jdk, dash.category(), dash::toUnicodeNotation); + } + } + + @Test + void testCodePointsAreUniqueAndStrictlyAscending() { + final int[] cps = UnicodeDash.codePoints(); + for (int i = 1; i < cps.length; i++) { + assertTrue(cps[i] > cps[i - 1], "dash code points must be unique and ascending"); + } + } + + @Test + void testMathematicalAreExactlyTheThreeMinusSigns() { + final Set math = UnicodeDash.mathematical().stream() + .map(DashCharacter::codePoint).collect(Collectors.toSet()); + assertEquals(Set.of(0x207B, 0x208B, 0x2212), math); + UnicodeDash.mathematical().forEach(d -> { + assertTrue(d.isMathematical()); + assertEquals(Category.Sm, d.category()); + }); + } + + @Test + void testDefaultDashSetExcludesMathematicalMinusSigns() { + final int[] defaults = UnicodeDash.defaultDashCodePoints(); + assertEquals(UnicodeDash.all().size() - 3, defaults.length); + for (final int codePoint : defaults) { + 
assertFalse(UnicodeDash.byCodePoint(codePoint).orElseThrow().isMathematical(), + () -> String.format("U+%04X must not be a math minus", codePoint)); + } + assertFalse(Arrays.stream(defaults).anyMatch(cp -> cp == 0x2212)); + } + + @Test + void testHyphenMinusIsTheCanonicalTarget() { + assertEquals(0x002D, UnicodeDash.HYPHEN_MINUS); + assertTrue(UnicodeDash.isDash(0x002D)); + assertEquals(Category.Pd, UnicodeDash.byCodePoint(0x002D).orElseThrow().category()); + } + + @Test + void testSupplementaryDashesArePresent() { + for (final int codePoint : new int[] {0x10D6E, 0x10EAD}) { + assertTrue(UnicodeDash.isDash(codePoint)); + assertTrue(UnicodeDash.byCodePoint(codePoint).orElseThrow().isSupplementary()); + } + } + + @Test + void testBmpDashIsNotSupplementary() { + assertFalse(UnicodeDash.byCodePoint(0x2014).orElseThrow().isSupplementary()); + } + + @Test + void testDashToUnicodeNotation() { + assertEquals("U+2014", UnicodeDash.byCodePoint(0x2014).orElseThrow().toUnicodeNotation()); + assertEquals("U+10EAD", UnicodeDash.byCodePoint(0x10EAD).orElseThrow().toUnicodeNotation()); + } + + @ParameterizedTest + @ValueSource(ints = {0x00AD, 0x002E, 0x0041, 0x0020, 0x007E, 0x1F600}) + void testNonDashesAreNotDashes(int codePoint) { + // Notably U+00AD SOFT HYPHEN is a format character, not a dash, and must not be treated as one. 
+ assertFalse(UnicodeDash.isDash(codePoint)); + } + + @ParameterizedTest + @ValueSource(ints = {-1, Integer.MIN_VALUE, Character.MAX_CODE_POINT + 1, Integer.MAX_VALUE}) + void testIsDashIsRangeSafe(int codePoint) { + assertFalse(UnicodeDash.isDash(codePoint)); + } + + @Test + void testByCodePointUnknownIsEmpty() { + assertTrue(UnicodeDash.byCodePoint('A').isEmpty()); + assertTrue(UnicodeDash.byCodePoint(0x00AD).isEmpty()); + } + + @Test + void testReferenceListIsImmutable() { + assertThrows(UnsupportedOperationException.class, () -> UnicodeDash.all().add(null)); + assertThrows(UnsupportedOperationException.class, () -> UnicodeDash.mathematical().add(null)); + } + + @Test + void testArrayAccessorsReturnDefensiveCopies() { + final int[] all = UnicodeDash.codePoints(); + all[0] = -1; + assertEquals(0x002D, UnicodeDash.codePoints()[0]); + + final int[] defaults = UnicodeDash.defaultDashCodePoints(); + defaults[0] = -1; + assertEquals(0x002D, UnicodeDash.defaultDashCodePoints()[0]); + } +} diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java new file mode 100644 index 000000000..bd040efc0 --- /dev/null +++ b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java @@ -0,0 +1,239 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.Arrays; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; +import java.util.stream.IntStream; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.MethodSource; +import org.junit.jupiter.params.provider.ValueSource; + +import opennlp.tools.util.normalizer.UnicodeWhitespace.Category; +import opennlp.tools.util.normalizer.UnicodeWhitespace.RelatedCharacter; +import opennlp.tools.util.normalizer.UnicodeWhitespace.WhitespaceCharacter; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class UnicodeWhitespaceTest { + + private static List whitespace() { + return UnicodeWhitespace.all(); + } + + private static List lookalikes() { + return UnicodeWhitespace.lookalikes(); + } + + // Maps the JDK's Unicode general category to our enum, used as an independent oracle. 
+ private static Category jdkCategory(int codePoint) { + return switch (Character.getType(codePoint)) { + case Character.CONTROL -> Category.Cc; + case Character.SPACE_SEPARATOR -> Category.Zs; + case Character.LINE_SEPARATOR -> Category.Zl; + case Character.PARAGRAPH_SEPARATOR -> Category.Zp; + case Character.FORMAT -> Category.Cf; + default -> null; + }; + } + + @Test + void testWhitespaceSetHasExactly25() { + assertEquals(25, UnicodeWhitespace.all().size()); + } + + @Test + void testLookalikeSetHasExactly6() { + assertEquals(6, UnicodeWhitespace.lookalikes().size()); + } + + @Test + void testRelatedCharacterExposesAttributes() { + final var bom = UnicodeWhitespace.lookalikes().stream() + .filter(r -> r.codePoint() == 0xFEFF).findFirst().orElseThrow(); + assertEquals("zero width no-break space", bom.name()); + assertEquals("BOM", bom.abbreviation()); + assertFalse(bom.note().isBlank()); + assertEquals("U+FEFF", bom.toUnicodeNotation()); + } + + @ParameterizedTest + @MethodSource("whitespace") + void testEachWhitespaceCharIsSelfConsistent(WhitespaceCharacter ws) { + assertTrue(UnicodeWhitespace.isWhitespace(ws.codePoint()), + () -> ws.toUnicodeNotation() + " should be whitespace"); + assertEquals(ws, UnicodeWhitespace.byCodePoint(ws.codePoint()).orElseThrow()); + assertFalse(UnicodeWhitespace.isLookalike(ws.codePoint()), + () -> ws.toUnicodeNotation() + " must not also be a look-alike"); + assertNotNull(ws.category()); + assertNotNull(ws.breaking()); + assertNotNull(ws.abbreviation()); + assertFalse(ws.name().isBlank()); + } + + @ParameterizedTest + @MethodSource("whitespace") + void testAllWhitespaceIsInTheBmp(WhitespaceCharacter ws) { + // Every Unicode White_Space code point is in the Basic Multilingual Plane (one char). 
+ assertTrue(ws.codePoint() <= 0xFFFF, ws::toUnicodeNotation); + assertEquals(1, Character.charCount(ws.codePoint())); + } + + @ParameterizedTest + @MethodSource("whitespace") + void testCategoryMatchesJdkUnicodeData(WhitespaceCharacter ws) { + // Independent cross-check: our hand-entered category must agree with the JDK's UCD. + assertEquals(jdkCategory(ws.codePoint()), ws.category(), ws::toUnicodeNotation); + } + + @Test + void testCodePointsAreUniqueAndStrictlyAscending() { + final int[] cps = UnicodeWhitespace.codePoints(); + for (int i = 1; i < cps.length; i++) { + assertTrue(cps[i] > cps[i - 1], + "code points must be unique and ascending at index " + i); + } + } + + @Test + void testCodePointsMatchAllOrder() { + final int[] fromRecords = whitespace().stream().mapToInt(WhitespaceCharacter::codePoint).toArray(); + assertArrayEqualsInt(fromRecords, UnicodeWhitespace.codePoints()); + } + + @Test + void testCodePointsReturnsDefensiveCopy() { + final int[] first = UnicodeWhitespace.codePoints(); + first[0] = -999; + assertEquals(0x0009, UnicodeWhitespace.codePoints()[0]); + } + + @ParameterizedTest + @MethodSource("lookalikes") + void testLookalikesAreNotWhitespace(RelatedCharacter related) { + assertFalse(UnicodeWhitespace.isWhitespace(related.codePoint()), + () -> related.toUnicodeNotation() + " is White_Space=no"); + assertTrue(UnicodeWhitespace.byCodePoint(related.codePoint()).isEmpty()); + assertTrue(UnicodeWhitespace.isLookalike(related.codePoint())); + // Every look-alike is a format character in the UCD. 
+ assertEquals(Category.Cf, jdkCategory(related.codePoint()), related::toUnicodeNotation); + } + + @Test + void testLineBreaksAreExactlyTheSeven() { + final Set expected = Set.of(0x000A, 0x000B, 0x000C, 0x000D, 0x0085, 0x2028, 0x2029); + assertEquals(expected, UnicodeWhitespace.lineBreaks().stream() + .map(WhitespaceCharacter::codePoint).collect(Collectors.toSet())); + } + + @Test + void testNonBreakingAreExactlyTheThree() { + final Set expected = Set.of(0x00A0, 0x2007, 0x202F); + assertEquals(expected, UnicodeWhitespace.nonBreaking().stream() + .map(WhitespaceCharacter::codePoint).collect(Collectors.toSet())); + } + + @ParameterizedTest + @ValueSource(ints = {0x0008, 0x000E, 0x001F, 0x0021, 0x1FFF, 0x200B, 0x202A, 0x2FFF, 0x3001}) + void testNeighboringCodePointsAreNotWhitespace(int codePoint) { + assertFalse(UnicodeWhitespace.isWhitespace(codePoint), + () -> String.format("U+%04X must not be whitespace", codePoint)); + } + + @Test + void testIncludesNbspAndNelThatJavaIsWhitespaceOmits() { + // Documents the deliberate divergence from Character.isWhitespace. + assertTrue(UnicodeWhitespace.isWhitespace(0x00A0)); + assertFalse(Character.isWhitespace(0x00A0)); + assertTrue(UnicodeWhitespace.isWhitespace(0x0085)); + assertFalse(Character.isWhitespace(0x0085)); + } + + @ParameterizedTest + @ValueSource(ints = {0x001C, 0x001D, 0x001E, 0x001F}) + void testExcludesInfoSeparatorsThatJavaIsWhitespaceIncludes(int codePoint) { + assertFalse(UnicodeWhitespace.isWhitespace(codePoint)); + assertTrue(Character.isWhitespace(codePoint)); + } + + @Test + void testIncludesTabThatIsSpaceCharOmits() { + // Character.isSpaceChar excludes the control whitespace; ours includes it. 
+ assertTrue(UnicodeWhitespace.isWhitespace(0x0009)); + assertFalse(Character.isSpaceChar(0x0009)); + } + + @Test + void testByCodePointUnknownIsEmpty() { + assertTrue(UnicodeWhitespace.byCodePoint('A').isEmpty()); + assertTrue(UnicodeWhitespace.byCodePoint(0x200B).isEmpty(), "a look-alike is not whitespace"); + } + + @ParameterizedTest + @ValueSource(ints = {Integer.MIN_VALUE, -1, Character.MAX_CODE_POINT + 1, Integer.MAX_VALUE}) + void testIsWhitespaceHandlesOutOfRangeSafely(int codePoint) { + assertFalse(UnicodeWhitespace.isWhitespace(codePoint)); + assertFalse(UnicodeWhitespace.isLookalike(codePoint)); + } + + @Test + void testReferenceListsAreImmutable() { + assertThrows(UnsupportedOperationException.class, + () -> UnicodeWhitespace.all().add(null)); + assertThrows(UnsupportedOperationException.class, + () -> UnicodeWhitespace.lookalikes().add(null)); + assertThrows(UnsupportedOperationException.class, + () -> UnicodeWhitespace.lineBreaks().add(null)); + assertThrows(UnsupportedOperationException.class, + () -> UnicodeWhitespace.nonBreaking().add(null)); + } + + @Test + void testToUnicodeNotationIsZeroPadded() { + assertEquals("U+0009", UnicodeWhitespace.byCodePoint(0x0009).orElseThrow().toUnicodeNotation()); + assertEquals("U+00A0", UnicodeWhitespace.byCodePoint(0x00A0).orElseThrow().toUnicodeNotation()); + assertEquals("U+3000", UnicodeWhitespace.byCodePoint(0x3000).orElseThrow().toUnicodeNotation()); + } + + @Test + void testLineBreakAndNonBreakingFlagsAgreeWithBreaking() { + final WhitespaceCharacter lf = UnicodeWhitespace.byCodePoint(0x000A).orElseThrow(); + assertTrue(lf.isLineBreak()); + assertFalse(lf.isNonBreaking()); + + final WhitespaceCharacter nbsp = UnicodeWhitespace.byCodePoint(0x00A0).orElseThrow(); + assertTrue(nbsp.isNonBreaking()); + assertFalse(nbsp.isLineBreak()); + + final WhitespaceCharacter space = UnicodeWhitespace.byCodePoint(0x0020).orElseThrow(); + assertFalse(space.isLineBreak()); + assertFalse(space.isNonBreaking()); + } + + 
private static void assertArrayEqualsInt(int[] expected, int[] actual) { + assertEquals(Arrays.toString(expected), Arrays.toString(actual)); + assertTrue(IntStream.range(0, expected.length).allMatch(i -> expected[i] == actual[i])); + } +} diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java index 6e6e54767..5b0a14f88 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java @@ -23,6 +23,7 @@ import java.nio.file.Files; import java.nio.file.Path; import java.util.ArrayList; +import java.util.Arrays; import java.util.HashMap; import java.util.List; import java.util.Map; @@ -40,6 +41,7 @@ import opennlp.tools.tokenize.BertTokenizer; import opennlp.tools.tokenize.Tokenizer; import opennlp.tools.tokenize.WordpieceTokenizer; +import opennlp.tools.util.normalizer.CharClass; /** * Base class for OpenNLP deep-learning classes using ONNX Runtime. @@ -327,6 +329,37 @@ protected static void validateSplitOptions(final int documentSplitSize, final in } } + /** + * Unicode-aware whitespace. Input is tokenized on the full Unicode {@code White_Space} set + * rather than the six ASCII characters Java's {@code \s} recognizes, and the same class is + * reused by subclasses that need to match against whitespace in the source text. + */ + protected static final CharClass WHITESPACE = CharClass.whitespace(); + + /** + * Splits {@code text} on Unicode whitespace and groups the resulting tokens into overlapping + * chunks, each rejoined with single ASCII spaces, ready for WordPiece tokenization. The split + * uses the Unicode {@code White_Space} set, so spacing such as a no-break space or the + * ideographic space is recognized, and it yields no empty tokens from leading, trailing, or + * repeated whitespace. + * + * @param text The input text. 
+ * @param documentSplitSize The maximum number of whitespace tokens per chunk. + * @param splitOverlapSize The number of tokens shared between consecutive chunks. + * @return The chunk strings, in order. + */ + protected static List whitespaceChunks(final String text, final int documentSplitSize, + final int splitOverlapSize) { + final String[] whitespaceTokenized = WHITESPACE.split(text); + final List groups = new ArrayList<>(); + for (final ChunkRange chunkRange : chunkRanges( + whitespaceTokenized.length, documentSplitSize, splitOverlapSize)) { + groups.add(String.join(" ", + Arrays.copyOfRange(whitespaceTokenized, chunkRange.start(), chunkRange.end()))); + } + return groups; + } + /** * Splits a token sequence into overlapping chunk ranges. * diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java index 7aa36e494..c7293fc8b 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java @@ -331,17 +331,10 @@ private List tokenize(final String text) { final List t = new LinkedList<>(); - // Segment long input text into overlapping chunks configured by InferenceOptions before - // feeding each chunk into BERT. + // Segment long input text into overlapping chunks (split on Unicode whitespace) configured by + // InferenceOptions before feeding each chunk into BERT. // https://medium.com/analytics-vidhya/text-classification-with-bert-using-transformers-for-long-text-inputs-f54833994dfd - final String[] whitespaceTokenized = text.split("\\s+"); - - for (ChunkRange chunkRange : chunkRanges( - whitespaceTokenized.length, documentSplitSize, splitOverlapSize)) { - - // The group is that subsection of string. 
- final String group = String.join(" ", - Arrays.copyOfRange(whitespaceTokenized, chunkRange.start(), chunkRange.end())); + for (final String group : whitespaceChunks(text, documentSplitSize, splitOverlapSize)) { // Now we can tokenize the group and continue. final String[] tokens = tokenizer.tokenize(group); diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java index e5b5c89b5..eff6b87d5 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java @@ -28,8 +28,6 @@ import java.util.Map; import java.util.Objects; import java.util.Set; -import java.util.regex.Matcher; -import java.util.regex.Pattern; import ai.onnxruntime.OnnxTensor; import ai.onnxruntime.OrtException; @@ -356,7 +354,7 @@ static List decodeSpans(String text, String[] tokens, float[][] tokenLabel continue; } - final SpanMatch match = findByRegex(text, spanText, characterStart, searchEnd); + final SpanMatch match = findInSource(text, spanText, characterStart, searchEnd); if (match.start() != -1) { spans.add(new Span(match.start(), match.end(), entityType, entity.probability())); characterStart = match.end(); @@ -567,35 +565,82 @@ private static int maxIndex(float[] arr) { /** * Locates reconstructed span text in a bounded region of the original input text. * + *

<p>Matching is a single forward cursor scan, not a regular expression. Each space in the + * reconstructed span matches a run of zero or more Unicode whitespace characters in the source + * (so an entity whose WordPiece pieces were rejoined with spaces, such as {@code "AT & T"} for + * {@code "AT&T"}, is still located), and every other code point matches case-insensitively. + * Using a cursor avoids {@link java.util.regex.Pattern}/{@link java.util.regex.Matcher} + * allocation and the ReDoS surface of regular expressions, and recognizes Unicode whitespace + * that Java's {@code \s} does not.</p>

+ * * @param text The original text. - * @param span The reconstructed span text. + * @param span The reconstructed span text, with sub-tokens separated by single ASCII spaces. * @param searchStart The first character offset to search from. * @param searchEnd The exclusive upper bound of the region to search. * @return The matched character offsets, or {@code (-1, -1)} when the reconstructed text * cannot be found in the requested region. */ - private static SpanMatch findByRegex(String text, String span, int searchStart, int searchEnd) { - - // Reconstructed span text normalizes whitespace, so match flexibly: a space in the span may - // map to any run of whitespace OR none in the source (e.g. punctuation/'&' inside "U.S.A", - // "AT&T" that wordpiece tokenization split apart). Use \s* rather than \s+ so such entities - // are still located instead of being silently dropped. - final String regex = Pattern.quote(span).replace(" ", "\\E\\s*\\Q"); + private static SpanMatch findInSource(String text, String span, int searchStart, int searchEnd) { - final Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE); - final Matcher matcher = pattern.matcher(text); final int regionStart = Math.min(Math.max(searchStart, 0), text.length()); final int regionEnd = Math.min(Math.max(searchEnd, regionStart), text.length()); - matcher.region(regionStart, regionEnd); - if (matcher.find()) { - return new SpanMatch(matcher.start(), matcher.end()); + int start = regionStart; + while (start < regionEnd) { + final int end = matchAt(text, span, start, regionEnd); + if (end != -1) { + return new SpanMatch(start, end); + } + start += Character.charCount(text.codePointAt(start)); } return new SpanMatch(-1, -1); } + /** + * Attempts to match {@code span} against {@code text} beginning at {@code start} and bounded by + * {@code regionEnd}. 
A space in {@code span} consumes a run of zero or more Unicode whitespace + * code points in the source; every other code point must match case-insensitively. + * + * @return The exclusive end offset of the match in {@code text}, or {@code -1} if no match + * begins at {@code start}. + */ + private static int matchAt(String text, String span, int start, int regionEnd) { + + int t = start; + int s = 0; + + while (s < span.length()) { + final int spanCp = span.codePointAt(s); + if (spanCp == ' ') { + while (t < regionEnd && WHITESPACE.contains(text.codePointAt(t))) { + t += Character.charCount(text.codePointAt(t)); + } + s += 1; + } else { + if (t >= regionEnd) { + return -1; + } + final int textCp = text.codePointAt(t); + if (!equalsIgnoreCase(spanCp, textCp)) { + return -1; + } + t += Character.charCount(textCp); + s += Character.charCount(spanCp); + } + } + + return t; + + } + + private static boolean equalsIgnoreCase(int a, int b) { + return a == b + || Character.toLowerCase(a) == Character.toLowerCase(b) + || Character.toUpperCase(a) == Character.toUpperCase(b); + } + private record LabelPrediction(String label, double probability) { } @@ -613,17 +658,10 @@ private List tokenize(final String text) { final List t = new LinkedList<>(); - // Segment long input text into overlapping chunks configured by InferenceOptions before - // feeding each chunk into BERT. + // Segment long input text into overlapping chunks (split on Unicode whitespace) configured by + // InferenceOptions before feeding each chunk into BERT. // https://medium.com/analytics-vidhya/text-classification-with-bert-using-transformers-for-long-text-inputs-f54833994dfd - final String[] whitespaceTokenized = text.split("\\s+"); - - for (ChunkRange chunkRange : chunkRanges( - whitespaceTokenized.length, documentSplitSize, splitOverlapSize)) { - - // The group is that subsection of string. 
- final String group = String.join(" ", - Arrays.copyOfRange(whitespaceTokenized, chunkRange.start(), chunkRange.end())); + for (final String group : whitespaceChunks(text, documentSplitSize, splitOverlapSize)) { // Now we can tokenize the group and continue. final String[] tokens = tokenizer.tokenize(group); diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java new file mode 100644 index 000000000..38ab38450 --- /dev/null +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.dl; + +import java.util.List; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +/** + * Model-free tests for {@link AbstractDL#whitespaceChunks(String, int, int)}, the shared + * tokenize-and-chunk seam used by both {@code NameFinderDL} and {@code DocumentCategorizerDL}. 
+ */ +public class AbstractDLChunkingTest { + + @Test + void testSplitsOnUnicodeWhitespaceNotJustAscii() { + // A no-break space (U+00A0) and an ideographic space (U+3000) are not matched by Java's \s + // but must still separate tokens; the chunk is rejoined with single ASCII spaces. + final String nbsp = new String(Character.toChars(0x00A0)); + final String ideographic = new String(Character.toChars(0x3000)); + assertEquals(List.of("alpha beta gamma"), + AbstractDL.whitespaceChunks("alpha" + nbsp + "beta" + ideographic + "gamma", 100, 0)); + } + + @Test + void testDropsEmptyTokensFromLeadingTrailingAndRepeatedWhitespace() { + // Unlike split("\\s+"), the Unicode-aware split yields no empty leading or trailing tokens. + assertEquals(List.of("a b c"), AbstractDL.whitespaceChunks(" a b\tc ", 100, 0)); + } + + @Test + void testAppliesChunkSizeWithoutOverlap() { + assertEquals(List.of("a b", "c d"), AbstractDL.whitespaceChunks("a b c d", 2, 0)); + } + + @Test + void testAppliesChunkOverlap() { + assertEquals(List.of("a b", "b c", "c d"), AbstractDL.whitespaceChunks("a b c d", 2, 1)); + } + + @Test + void testEmptyTextYieldsNoChunks() { + assertEquals(List.of(), AbstractDL.whitespaceChunks("", 100, 0)); + } +} diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java index c0a8aede2..1c97e0ad1 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/namefinder/NameFinderDLTest.java @@ -169,7 +169,7 @@ void testDecodeSpansSearchStartLocatesNextOccurrence() { void testDecodeSpansLocatesEntityWithInternalPunctuation() { // WordPiece splits "AT&T" into separate AT / & / T tokens, so the reconstructed span text // ("AT & T") must still be located in the contiguous source. 
Regression guard for the - // flexible-whitespace (\s*) matching in findByRegex. + // flexible-whitespace matching in findInSource (a span space matches zero source whitespace). final String text = "Buy AT&T stock"; final String[] tokens = {"[CLS]", "Buy", "AT", "&", "T", "stock", "[SEP]"}; final float[][] scores = { @@ -184,6 +184,37 @@ void testDecodeSpansLocatesEntityWithInternalPunctuation() { assertEquals("AT&T", spans.get(0).getCoveredText(text)); } + @Test + void testDecodeSpansMatchesEntitySeparatedByNoBreakSpace() { + // The source separates "New" and "York" with a no-break space (U+00A0). Java's \s does not + // match it, so the previous regex matcher would have dropped this LOC span; the Unicode-aware + // cursor matcher locates it and the covered text includes the no-break space. + final String nbsp = new String(Character.toChars(0x00A0)); + final String text = "Visit New" + nbsp + "York today"; + final String[] tokens = {"[CLS]", "New", "York", "[SEP]"}; + final float[][] scores = {scoresFor(0), scoresFor(3), scoresFor(4), scoresFor(0)}; + + final List spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(1, spans.size()); + assertEquals("LOC", spans.get(0).getType()); + assertEquals("New" + nbsp + "York", spans.get(0).getCoveredText(text)); + } + + @Test + void testDecodeSpansMatchesEntitySeparatedByIdeographicSpace() { + // Same idea with the CJK ideographic space (U+3000), another character outside Java's \s. 
+ final String ideographic = new String(Character.toChars(0x3000)); + final String text = "from New" + ideographic + "York city"; + final String[] tokens = {"[CLS]", "New", "York", "[SEP]"}; + final float[][] scores = {scoresFor(0), scoresFor(3), scoresFor(4), scoresFor(0)}; + + final List spans = NameFinderDL.decodeSpans(text, tokens, scores, ID_TO_LABELS); + + assertEquals(1, spans.size()); + assertEquals("New" + ideographic + "York", spans.get(0).getCoveredText(text)); + } + @Test void testDecodeSpansDoesNotMatchBeyondSearchEnd() { final String text = "London was quiet. Later Paris was loud."; diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizer.java new file mode 100644 index 000000000..3a940b1b8 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizer.java @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.text.Normalizer; +import java.util.Set; + +/** + * A {@link CharSequenceNormalizer} that folds diacritics for search and matching, the + * multilingual-safe counterpart to a Latin-only ASCII folding filter. + * + *

<p>Folding decomposes the text (NFD) and drops nonspacing combining marks, but only for base + * characters whose script is in {@code foldScripts} (Latin, Greek, and Cyrillic by default). Marks + * on other scripts are left untouched, because there they are essential orthography rather than + * decoration: stripping an Indic vowel sign or a virama, an Arabic harakat, a Hebrew point, or a + * Thai vowel changes the word. This script gating is the key correctness rule; never strip all + * nonspacing marks globally.</p>

+ * + *

<p>Many "accented" Latin letters are atomic and do not decompose ({@code o} with stroke, the + * {@code ae}/{@code oe} ligatures, eszett, thorn, and so on). When {@code foldStrokeLetters} is + * enabled (the default) these are mapped to an ASCII approximation. Folding is a recall + * optimization, not a linguistically correct transform, so it is intended for a search/matching + * token rather than for display or language-specific analysis.</p>

+ * + *

<p>Scanning is a single cursor pass over the decomposed text; no regular expression is used, and + * no global {@code \p{Mn}} strip is performed.</p>

+ */ +public class AccentFoldCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final Set DEFAULT_SCRIPTS = Set.of( + Character.UnicodeScript.LATIN, + Character.UnicodeScript.GREEK, + Character.UnicodeScript.CYRILLIC); + + private static final AccentFoldCharSequenceNormalizer INSTANCE = + new AccentFoldCharSequenceNormalizer(DEFAULT_SCRIPTS, true); + + private final Set foldScripts; + private final boolean foldStrokeLetters; + + /** + * Creates a folder. + * + * @param foldScripts The scripts whose base characters' diacritics are folded; marks on every + * other script are preserved. + * @param foldStrokeLetters Whether atomic Latin letters such as the stroke letters and ligatures + * are mapped to an ASCII approximation. + */ + public AccentFoldCharSequenceNormalizer(Set foldScripts, + boolean foldStrokeLetters) { + this.foldScripts = Set.copyOf(foldScripts); + this.foldStrokeLetters = foldStrokeLetters; + } + + /** {@return the shared instance with the safe defaults: Latin, Greek, and Cyrillic plus the + * stroke-letter map} */ + public static AccentFoldCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + final String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD); + final StringBuilder out = new StringBuilder(decomposed.length()); + + Character.UnicodeScript baseScript = null; + int i = 0; + final int length = decomposed.length(); + while (i < length) { + final int codePoint = decomposed.codePointAt(i); + if (Character.getType(codePoint) == Character.NON_SPACING_MARK) { + // Drop the mark only when its base character belongs to a folded script. + if (baseScript == null || !foldScripts.contains(baseScript)) { + out.appendCodePoint(codePoint); + } + } else { + final String mapped = foldStrokeLetters ? 
strokeLetter(codePoint) : null; + if (mapped != null) { + out.append(mapped); + baseScript = Character.UnicodeScript.LATIN; + } else { + out.appendCodePoint(codePoint); + baseScript = Character.UnicodeScript.of(codePoint); + } + } + i += Character.charCount(codePoint); + } + + return Normalizer.normalize(out, Normalizer.Form.NFC); + } + + // Atomic Latin letters that NFD does not decompose, mapped to an ASCII approximation. + private static String strokeLetter(int codePoint) { + return switch (codePoint) { + case 0x00F8 -> "o"; // o with stroke + case 0x00D8 -> "O"; // O with stroke + case 0x00E6 -> "ae"; // ae ligature + case 0x00C6 -> "AE"; // AE ligature + case 0x0153 -> "oe"; // oe ligature + case 0x0152 -> "OE"; // OE ligature + case 0x00DF -> "ss"; // eszett + case 0x1E9E -> "SS"; // capital eszett + case 0x00FE -> "th"; // thorn + case 0x00DE -> "TH"; // capital thorn + case 0x00F0 -> "d"; // eth + case 0x00D0 -> "D"; // capital eth + case 0x0111 -> "d"; // d with stroke + case 0x0110 -> "D"; // D with stroke + case 0x0142 -> "l"; // l with stroke + case 0x0141 -> "L"; // L with stroke + case 0x0127 -> "h"; // h with stroke + case 0x0126 -> "H"; // H with stroke + case 0x0131 -> "i"; // dotless i + default -> null; + }; + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java new file mode 100644 index 000000000..176dd108b --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.Locale; + +/** + * A {@link CharSequenceNormalizer} that lower cases text for case-insensitive matching, using + * {@link Locale#ROOT} so the result does not depend on the JVM's default locale. + * + *

<p>This is the case-folding step of a search / BM25 analysis chain (the counterpart to Lucene's + * lower-case filter). {@code Locale.ROOT} avoids locale surprises such as the Turkish dotless-i + * mapping; callers that need language-specific case rules should fold with an explicit locale + * upstream. Full Unicode case folding (for example German eszett, {@code U+00DF}, to {@code ss}) + * is a distinct, heavier transform and is intentionally out of scope here.</p>

+ */ +public class CaseFoldCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final CaseFoldCharSequenceNormalizer INSTANCE = + new CaseFoldCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static CaseFoldCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return text.toString().toLowerCase(Locale.ROOT); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DashCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DashCharSequenceNormalizer.java new file mode 100644 index 000000000..31237e73f --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DashCharSequenceNormalizer.java @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that maps every Unicode dash to an ASCII hyphen-minus + * ({@code U+002D}), reusing the cursor based {@link CharClass#dashes()} engine. + * + *

<p>This folds the many dash code points (en dash, em dash, figure dash, non-breaking hyphen, + * fullwidth hyphen, and so on) to a single form so that {@code "state-of-the-art"} matches + * regardless of which dash the source used. The mathematical minus signs are left untouched by + * default, and {@code U+00AD} SOFT HYPHEN (a format character) is not treated as a dash.</p>

+ */ +public class DashCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final CharClass DASHES = CharClass.dashes(); + + private static final DashCharSequenceNormalizer INSTANCE = new DashCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static DashCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return DASHES.normalize(text); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfcCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfcCharSequenceNormalizer.java new file mode 100644 index 000000000..72d25d93b --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfcCharSequenceNormalizer.java @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.text.Normalizer; + +/** + * A {@link CharSequenceNormalizer} that applies Unicode Normalization Form C (canonical + * composition, UAX #15). + * + *

<p>NFC is the safe, lossless (under canonical equivalence) baseline for matching: precomposed + * and decomposed spellings of the same text (for example {@code U+00E9} versus {@code e} plus a + * combining acute accent) become identical, so equal text compares equal regardless of how it was + * encoded. It changes no characters' meaning and is the W3C-recommended interchange form.</p>

+ */ +public class NfcCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final NfcCharSequenceNormalizer INSTANCE = new NfcCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static NfcCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return Normalizer.normalize(text, Normalizer.Form.NFC); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfkcCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfkcCharSequenceNormalizer.java new file mode 100644 index 000000000..c95568fab --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NfkcCharSequenceNormalizer.java @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.text.Normalizer; + +/** + * A {@link CharSequenceNormalizer} that applies Unicode Normalization Form KC (compatibility + * composition, UAX #15). + * + *

<p>NFKC folds compatibility variants to their canonical form: fullwidth and halfwidth letters, + * the {@code U+FB01} ligature to {@code fi}, and super/subscript digits to plain digits. It is + * more aggressive than {@link NfcCharSequenceNormalizer NFC} and is lossy (it can change a + * character's appearance or meaning, e.g. a squared numeral to a plain one), so it is a deliberate + * choice for search/recall rather than a safe default.</p>

+ */ +public class NfkcCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final NfkcCharSequenceNormalizer INSTANCE = new NfkcCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static NfkcCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return Normalizer.normalize(text, Normalizer.Form.NFKC); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/WhitespaceCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/WhitespaceCharSequenceNormalizer.java new file mode 100644 index 000000000..affa82745 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/WhitespaceCharSequenceNormalizer.java @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that collapses each run of Unicode whitespace to a single ASCII + * space and trims the edges, reusing the cursor based {@link CharClass#whitespace()} engine. + * + *

Unlike a {@code \s} regular expression, this recognizes the full Unicode {@code White_Space} + * set (no-break space, ideographic space, the typographic spaces, line and paragraph separators, + * and so on), so spacing copied from the web, PDFs, or non-Latin sources normalizes consistently. + * It is the Unicode-aware, regex-free counterpart to {@link ShrinkCharSequenceNormalizer}.

+ */ +public class WhitespaceCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final CharClass WHITESPACE = CharClass.whitespace(); + + private static final WhitespaceCharSequenceNormalizer INSTANCE = + new WhitespaceCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static WhitespaceCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return WHITESPACE.trim(WHITESPACE.collapse(text)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java new file mode 100644 index 000000000..ba4a6ea4b --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java @@ -0,0 +1,115 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.util.Set; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertSame; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class AccentFoldCharSequenceNormalizerTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + private static String fold(String text) { + return AccentFoldCharSequenceNormalizer.getInstance().normalize(text).toString(); + } + + @Test + void testFoldsLatinAccents() { + assertEquals("cafe", fold("caf" + cp(0x00E9))); // cafe with acute e + assertEquals("naive", fold("na" + cp(0x00EF) + "ve")); // naive with diaeresis i + assertEquals("Muller", fold("M" + cp(0x00FC) + "ller")); // Muller with umlaut u + assertEquals("anos", fold("a" + cp(0x00F1) + "os")); // anos with tilde n + } + + @Test + void testMapsStrokeAndLigatureLetters() { + assertEquals("o", fold(cp(0x00F8))); // o with stroke + assertEquals("ae", fold(cp(0x00E6))); // ae ligature + assertEquals("oe", fold(cp(0x0153))); // oe ligature + assertEquals("Strasse", fold("Stra" + cp(0x00DF) + "e")); // eszett + assertEquals("th", fold(cp(0x00FE))); // thorn + assertEquals("l", fold(cp(0x0142))); // l with stroke + assertEquals("i", fold(cp(0x0131))); // dotless i + } + + @Test + void testFoldsGreekAndCyrillicAccents() { + assertEquals(cp(0x03B1), fold(cp(0x03AC))); // Greek alpha with tonos -> alpha + assertEquals(cp(0x0438), fold(cp(0x0439))); // Cyrillic short i -> i + } + + @Test + void testLeavesAsciiUnchanged() { + assertEquals("hello world", fold("hello world")); + } + + @Test + void testDoesNotTouchDevanagariArabicOrHebrewMarks() { + // The critical guard: marks on non-folded scripts are essential orthography and must survive. 
+ final String devanagari = cp(0x0915) + cp(0x093E); // ka + aa vowel sign + assertEquals(devanagari, fold(devanagari)); + + final String arabic = cp(0x0628) + cp(0x064E); // beh + fatha (a nonspacing mark) + assertEquals(arabic, fold(arabic)); + assertTrue(fold(arabic).indexOf(0x064E) >= 0, "the Arabic fatha must not be stripped"); + + final String hebrew = cp(0x05D0) + cp(0x05B8); // alef + qamats (a nonspacing mark) + assertEquals(hebrew, fold(hebrew)); + assertTrue(fold(hebrew).indexOf(0x05B8) >= 0, "the Hebrew point must not be stripped"); + } + + @Test + void testScriptScopeIsConfigurable() { + // With no folded scripts, Latin accents are preserved. + final AccentFoldCharSequenceNormalizer none = + new AccentFoldCharSequenceNormalizer(Set.of(), false); + assertEquals("caf" + cp(0x00E9), none.normalize("caf" + cp(0x00E9)).toString()); + + // Widening the scope to Arabic folds an Arabic mark that the default leaves untouched. + final AccentFoldCharSequenceNormalizer arabicToo = + new AccentFoldCharSequenceNormalizer(Set.of(Character.UnicodeScript.ARABIC), false); + assertEquals(cp(0x0628), arabicToo.normalize(cp(0x0628) + cp(0x064E)).toString()); + } + + @Test + void testStrokeLetterMappingIsConfigurable() { + final AccentFoldCharSequenceNormalizer noStroke = + new AccentFoldCharSequenceNormalizer(Set.of(Character.UnicodeScript.LATIN), false); + assertEquals(cp(0x00DF), noStroke.normalize(cp(0x00DF)).toString()); // eszett kept as-is + } + + @Test + void testComposesAfterCaseFold() { + final CharSequenceNormalizer pipeline = new AggregateCharSequenceNormalizer( + CaseFoldCharSequenceNormalizer.getInstance(), + AccentFoldCharSequenceNormalizer.getInstance()); + assertEquals("cafe", pipeline.normalize("CAF" + cp(0x00C9)).toString()); // CAFE with acute E + } + + @Test + void testInstanceIsSharedSingleton() { + assertSame(AccentFoldCharSequenceNormalizer.getInstance(), + AccentFoldCharSequenceNormalizer.getInstance()); + } +} diff --git 
a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeCharSequenceNormalizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeCharSequenceNormalizerTest.java new file mode 100644 index 000000000..7a700739f --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeCharSequenceNormalizerTest.java @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertSame; + +/** + * Tests for the {@link CharClass}-backed and Unicode-normalization {@link CharSequenceNormalizer} + * implementations, and their composition through {@link AggregateCharSequenceNormalizer}. 
+ */ +public class UnicodeCharSequenceNormalizerTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + @Test + void testWhitespaceCollapsesUnicodeRunsAndTrims() { + final String input = " a" + cp(0x00A0) + cp(0x00A0) + "b" + cp(0x3000) + " "; + assertEquals("a b", + WhitespaceCharSequenceNormalizer.getInstance().normalize(input).toString()); + } + + @Test + void testDashFoldsUnicodeDashesButNotMathMinus() { + assertEquals("a-b", + DashCharSequenceNormalizer.getInstance().normalize("a" + cp(0x2014) + "b").toString()); + final String math = "5" + cp(0x2212) + "3"; + assertEquals(math, DashCharSequenceNormalizer.getInstance().normalize(math).toString()); + } + + @Test + void testNfcComposesDecomposedSequences() { + // "e" + combining acute accent -> the precomposed letter U+00E9. + assertEquals(cp(0x00E9), + NfcCharSequenceNormalizer.getInstance().normalize("e" + cp(0x0301)).toString()); + } + + @Test + void testNfkcFoldsCompatibilityForms() { + assertEquals("A", + NfkcCharSequenceNormalizer.getInstance().normalize(cp(0xFF21)).toString()); + assertEquals("fi", + NfkcCharSequenceNormalizer.getInstance().normalize(cp(0xFB01)).toString()); + } + + @Test + void testCaseFoldLowercasesIndependentOfLocale() { + assertEquals("abc", CaseFoldCharSequenceNormalizer.getInstance().normalize("ABC").toString()); + // Accents are preserved; only case changes (CAFE-acute -> cafe-acute). 
+ assertEquals("caf" + cp(0x00E9), + CaseFoldCharSequenceNormalizer.getInstance().normalize("CAF" + cp(0x00C9)).toString()); + } + + @Test + void testInstancesAreSharedSingletons() { + assertSame(WhitespaceCharSequenceNormalizer.getInstance(), + WhitespaceCharSequenceNormalizer.getInstance()); + assertSame(DashCharSequenceNormalizer.getInstance(), + DashCharSequenceNormalizer.getInstance()); + assertSame(NfcCharSequenceNormalizer.getInstance(), + NfcCharSequenceNormalizer.getInstance()); + assertSame(NfkcCharSequenceNormalizer.getInstance(), + NfkcCharSequenceNormalizer.getInstance()); + assertSame(CaseFoldCharSequenceNormalizer.getInstance(), + CaseFoldCharSequenceNormalizer.getInstance()); + } + + @Test + void testComposeIntoAUnifiedPipeline() { + // NFC, then Unicode whitespace, then dash folding, applied in order through the aggregate. + final CharSequenceNormalizer pipeline = new AggregateCharSequenceNormalizer( + NfcCharSequenceNormalizer.getInstance(), + WhitespaceCharSequenceNormalizer.getInstance(), + DashCharSequenceNormalizer.getInstance()); + + final String input = cp(0x00A0) + "a" + cp(0x2014) + "b" + cp(0x00A0); + assertEquals("a-b", pipeline.normalize(input).toString()); + } +} From ab15d7523bc768a39d46e3b291417cb8220501e3 Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Thu, 18 Jun 2026 23:12:04 -0400 Subject: [PATCH 05/11] OPENNLP-1850 - Add quote/digit/invisible/ellipsis/bullet normalizers, the TextNormalizer pipeline, and offset-preserving TextAnalyzer Quote, digit, decimal, invisible-control, ellipsis, and bullet normalizers, all reusing the cursor-based CharClass engine (O(1) membership, no regex). TextNormalizer is a fluent builder that composes the rungs into an AggregateCharSequenceNormalizer, with a conservative searchDefault() chain. TextAnalyzer/AnalyzedToken tokenize and normalize per token while keeping each token's source span, the offset-preserving building block for BM25 matching. 
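The offset-preserving idea described above can be sketched without any OpenNLP types. This is a minimal, standalone illustration and not the code in this patch: the class `OffsetPreservingSketch` and its `Token` record are hypothetical names, `Character.isWhitespace` stands in for the `CharClass` splitter (it does not treat the no-break space as whitespace, unlike the Unicode-aware splitter the patch describes), and lowercasing stands in for the per-token normalizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; names here are illustrative and not part of the patch.
final class OffsetPreservingSketch {

  // Stand-in for AnalyzedToken: plain start/end offsets instead of a Span.
  record Token(int start, int end, String original, String normalized) { }

  // Splits on whitespace in one cursor pass and normalizes each token,
  // keeping the token's offsets in the source text.
  static List<Token> analyze(String text) {
    final List<Token> tokens = new ArrayList<>();
    final int length = text.length();
    int i = 0;
    while (i < length) {
      // Skip a run of delimiter code points.
      while (i < length) {
        final int cp = text.codePointAt(i);
        if (!Character.isWhitespace(cp)) {
          break;
        }
        i += Character.charCount(cp);
      }
      final int start = i;
      // Consume a run of non-delimiter code points.
      while (i < length) {
        final int cp = text.codePointAt(i);
        if (Character.isWhitespace(cp)) {
          break;
        }
        i += Character.charCount(cp);
      }
      if (i > start) {
        final String original = text.substring(start, i);
        // The offsets always refer to the source, whatever the normalizer does.
        tokens.add(new Token(start, i, original, original.toLowerCase(Locale.ROOT)));
      }
    }
    return tokens;
  }
}
```

Because the offsets are taken before normalization, a match on `normalized()` can be highlighted with `text.substring(token.start(), token.end())` even if a normalizer changes the token's length.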
--- .../tools/util/normalizer/AnalyzedToken.java | 34 ++++ .../tools/util/normalizer/TextAnalyzer.java | 93 ++++++++++ .../util/normalizer/TextAnalyzerTest.java | 102 +++++++++++ .../BulletCharSequenceNormalizer.java | 51 ++++++ .../DigitCharSequenceNormalizer.java | 57 ++++++ .../EllipsisCharSequenceNormalizer.java | 60 +++++++ .../InvisibleCharSequenceNormalizer.java | 71 ++++++++ .../QuoteCharSequenceNormalizer.java | 69 ++++++++ .../tools/util/normalizer/TextNormalizer.java | 142 +++++++++++++++ .../AccentFoldCharSequenceNormalizerTest.java | 30 ++++ .../normalizer/SetBasedNormalizerTest.java | 163 ++++++++++++++++++ .../util/normalizer/TextNormalizerTest.java | 77 +++++++++ 12 files changed, 949 insertions(+) create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java create mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java create mode 100644 opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/BulletCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DigitCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/EllipsisCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/InvisibleCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/QuoteCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/SetBasedNormalizerTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TextNormalizerTest.java diff --git 
a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java new file mode 100644 index 000000000..389146596 --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import opennlp.tools.util.Span; + +/** + * One analyzed token: its character span in the source text, the original token text, and the + * normalized form used for matching or indexing. + * + *

The span ties the normalized term back to the original text, so a search hit on + * {@link #normalized()} can be highlighted against the source using {@link #span()} even though + * the normalized form may differ in length (for example after diacritic folding).

+ * + * @param span The character span of the token in the source text. + * @param original The original token text. + * @param normalized The normalized token text (the match/index form). + */ +public record AnalyzedToken(Span span, String original, String normalized) { +} diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java new file mode 100644 index 000000000..7e8ce8d77 --- /dev/null +++ b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.ArrayList; +import java.util.List; +import java.util.Objects; + +import opennlp.tools.util.Span; + +/** + * Splits text into tokens and normalizes each one, keeping every token's original character span. + * + *

This is the offset-preserving building block for search and BM25-style matching: tokens are + * found with a {@link CharClass} splitter (O(1) membership, a single cursor pass, no regular + * expression) and each token's text is run through a {@link CharSequenceNormalizer}. The result is + * a list of {@link AnalyzedToken}, each carrying the source {@link Span} alongside its normalized + * form, so a match on the normalized term can always be reported and highlighted against the + * original text even when normalization changes a token's length.

+ */ +public final class TextAnalyzer { + + private final CharClass splitter; + private final CharSequenceNormalizer normalizer; + + /** + * Creates an analyzer. + * + * @param splitter The character class whose members delimit tokens (typically + * {@link CharClass#whitespace()}). + * @param normalizer The per-token normalizer. + */ + public TextAnalyzer(CharClass splitter, CharSequenceNormalizer normalizer) { + this.splitter = Objects.requireNonNull(splitter, "splitter"); + this.normalizer = Objects.requireNonNull(normalizer, "normalizer"); + } + + /** + * Creates an analyzer that splits on Unicode whitespace. + * + * @param normalizer The per-token normalizer. + * @return The analyzer. + */ + public static TextAnalyzer whitespace(CharSequenceNormalizer normalizer) { + return new TextAnalyzer(CharClass.whitespace(), normalizer); + } + + /** + * Tokenizes {@code text} and normalizes each token. + * + * @param text The text to analyze. + * @return The analyzed tokens, in order, each with its source span and normalized form. + */ + public List<AnalyzedToken> analyze(CharSequence text) { + Objects.requireNonNull(text, "text"); + final List<AnalyzedToken> tokens = new ArrayList<>(); + for (final Span span : splitter.splitSpans(text)) { + final String original = text.subSequence(span.getStart(), span.getEnd()).toString(); + final String normalized = normalizer.normalize(original).toString(); + tokens.add(new AnalyzedToken(span, original, normalized)); + } + return tokens; + } + + /** + * Tokenizes {@code text} and returns only the normalized terms. + * + * @param text The text to analyze. + * @return The normalized token terms, in order.
+ */ + public List<String> terms(CharSequence text) { + final List<AnalyzedToken> analyzed = analyze(text); + final List<String> terms = new ArrayList<>(analyzed.size()); + for (final AnalyzedToken token : analyzed) { + terms.add(token.normalized()); + } + return terms; + } +} diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java new file mode 100644 index 000000000..77decf860 --- /dev/null +++ b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ +package opennlp.tools.util.normalizer; + +import java.util.List; +import java.util.Locale; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class TextAnalyzerTest { + + private static final CharSequenceNormalizer LOWER = s -> s.toString().toLowerCase(Locale.ROOT); + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + @Test + void testAnalyzePreservesSpansAndNormalizesTokens() { + final String text = "Hello WORLD"; + final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(LOWER).analyze(text); + + assertEquals(2, tokens.size()); + assertEquals(0, tokens.get(0).span().getStart()); + assertEquals(5, tokens.get(0).span().getEnd()); + assertEquals("Hello", tokens.get(0).original()); + assertEquals("hello", tokens.get(0).normalized()); + assertEquals("WORLD", tokens.get(1).original()); + assertEquals("world", tokens.get(1).normalized()); + assertEquals("Hello", tokens.get(0).span().getCoveredText(text).toString()); + } + + @Test + void testSpanStaysCorrectWhenNormalizedLengthChanges() { + final CharSequenceNormalizer bracket = s -> "[" + s + "]"; + final String text = "ab cd"; + final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(bracket).analyze(text); + + assertEquals("[ab]", tokens.get(0).normalized()); + assertEquals(0, tokens.get(0).span().getStart()); + assertEquals(2, tokens.get(0).span().getEnd()); + assertEquals(3, tokens.get(1).span().getStart()); + assertEquals(5, tokens.get(1).span().getEnd()); + } + + @Test + void testSplitsOnUnicodeWhitespace() { + final String text = "alpha" + cp(0x00A0) + "beta"; + final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(LOWER).analyze(text); + + assertEquals(2, tokens.size()); + assertEquals("alpha", tokens.get(0).normalized()); + assertEquals("beta", tokens.get(1).normalized()); + } + + @Test + void
testSupplementaryTokenIsKeptIntact() { + final String emoji = cp(0x1F600); + final String text = "a " + emoji + " b"; + final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(LOWER).analyze(text); + + assertEquals(3, tokens.size()); + assertEquals(emoji, tokens.get(1).original()); + assertTrue(tokens.get(1).span().getEnd() - tokens.get(1).span().getStart() == emoji.length()); + } + + @Test + void testTermsReturnsNormalizedFormsOnly() { + assertEquals(List.of("a", "b", "c"), TextAnalyzer.whitespace(LOWER).terms("A B C")); + } + + @Test + void testEmptyAndWhitespaceOnlyYieldNoTokens() { + assertEquals(List.of(), TextAnalyzer.whitespace(LOWER).analyze("")); + assertEquals(List.of(), TextAnalyzer.whitespace(LOWER).analyze(" ")); + } + + @Test + void testRejectsNullArguments() { + assertThrows(NullPointerException.class, () -> new TextAnalyzer(null, LOWER)); + assertThrows(NullPointerException.class, () -> new TextAnalyzer(CharClass.whitespace(), null)); + assertThrows(NullPointerException.class, () -> TextAnalyzer.whitespace(LOWER).analyze(null)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/BulletCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/BulletCharSequenceNormalizer.java new file mode 100644 index 000000000..a58a45c53 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/BulletCharSequenceNormalizer.java @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License.
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that replaces unambiguous list-bullet characters with a space, + * so a bullet acts as a token separator rather than sticking to the following word. + * + *

Membership is an O(1) {@link CharClass} lookup and scanning is a single cursor pass with no + * regular expression. The middle dot ({@code U+00B7}) is deliberately not included, + * because it is part of ordinary spelling in Catalan ({@code l..l}) and other orthographies; only + * characters that are unambiguously list bullets are replaced.

+ */ +public class BulletCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final CharClass BULLETS = CharClass.of(CodePointSet.of( + 0x2022, // bullet + 0x2023, // triangular bullet + 0x2043, // hyphen bullet + 0x2219, // bullet operator + 0x25E6), // white bullet + 0x0020); + + private static final BulletCharSequenceNormalizer INSTANCE = new BulletCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static BulletCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return BULLETS.normalize(text); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DigitCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DigitCharSequenceNormalizer.java new file mode 100644 index 000000000..90ee0d3d1 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/DigitCharSequenceNormalizer.java @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that maps Unicode decimal digits to their ASCII equivalents, + * so for example Arabic-Indic, Devanagari, or fullwidth digits all become {@code 0}-{@code 9}. + * + *

It maps a code point when {@link Character#digit(int, int)} reports a value of {@code 0}- + * {@code 9} in radix ten, that is, when the code point is a Unicode decimal digit. Other numeric + * forms (Roman numerals, superscripts, circled numbers, fractions) are not decimal digits and are + * left unchanged. Scanning is a single O(1)-per-code-point cursor pass with no regular + * expression.

+ */ +public class DigitCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final DigitCharSequenceNormalizer INSTANCE = new DigitCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static DigitCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + final StringBuilder out = new StringBuilder(text.length()); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + final int value = Character.digit(codePoint, 10); + if (value >= 0) { + out.append((char) ('0' + value)); + } else { + out.appendCodePoint(codePoint); + } + i += Character.charCount(codePoint); + } + return out.toString(); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/EllipsisCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/EllipsisCharSequenceNormalizer.java new file mode 100644 index 000000000..8eccf2e5c --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/EllipsisCharSequenceNormalizer.java @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that expands the ellipsis and leader characters to ASCII dots: + * the horizontal ellipsis ({@code U+2026}) to {@code "..."} and the two-dot leader + * ({@code U+2025}) to {@code ".."}. + * + *

Scanning is a single O(1)-per-code-point cursor pass with no regular expression. ASCII dot + * runs are left unchanged.

+ */ +public class EllipsisCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final EllipsisCharSequenceNormalizer INSTANCE = + new EllipsisCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static EllipsisCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + final StringBuilder out = new StringBuilder(text.length()); + final int length = text.length(); + int i = 0; + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + final String mapped = switch (codePoint) { + case 0x2026 -> "..."; // horizontal ellipsis + case 0x2025 -> ".."; // two dot leader + default -> null; + }; + if (mapped != null) { + out.append(mapped); + } else { + out.appendCodePoint(codePoint); + } + i += Character.charCount(codePoint); + } + return out.toString(); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/InvisibleCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/InvisibleCharSequenceNormalizer.java new file mode 100644 index 000000000..3828f6f22 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/InvisibleCharSequenceNormalizer.java @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that removes invisible format and bidirectional control + * characters that add no textual content and are a common source of noise and spoofing (the + * byte-order mark, zero width space, word joiner, bidi marks/embeddings/overrides/isolates, the + * invisible math operators, soft hyphen, and the Arabic letter mark). + * + *

Membership is an O(1) {@link CharClass} lookup and removal is a single cursor pass with no + * regular expression. The zero width joiner ({@code U+200D}) and non-joiner ({@code U+200C}) are + * deliberately kept, because they carry meaning in Persian, Indic scripts, and emoji + * sequences; so are variation selectors. Use this only for a matching/search form, not for + * display.

+ */ +public class InvisibleCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + // The replacement is unused: removeAll deletes members rather than substituting them. + private static final CharClass INVISIBLE = CharClass.of(CodePointSet.of( + 0x00AD, // soft hyphen + 0x061C, // arabic letter mark + 0x200B, // zero width space + 0x200E, // left-to-right mark + 0x200F, // right-to-left mark + 0x202A, // left-to-right embedding + 0x202B, // right-to-left embedding + 0x202C, // pop directional formatting + 0x202D, // left-to-right override + 0x202E, // right-to-left override + 0x2060, // word joiner + 0x2061, // function application + 0x2062, // invisible times + 0x2063, // invisible separator + 0x2064, // invisible plus + 0x2066, // left-to-right isolate + 0x2067, // right-to-left isolate + 0x2068, // first strong isolate + 0x2069, // pop directional isolate + 0xFEFF), // zero width no-break space (byte order mark) + 0x0020); + + private static final InvisibleCharSequenceNormalizer INSTANCE = + new InvisibleCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static InvisibleCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return INVISIBLE.removeAll(text); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/QuoteCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/QuoteCharSequenceNormalizer.java new file mode 100644 index 000000000..acef8dcd0 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/QuoteCharSequenceNormalizer.java @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that folds typographic quotation marks to their ASCII forms: + * the single quotes and apostrophes to {@code '} and the double quotes to {@code "}. + * + *

This is high value for matching, since curly quotes, guillemets, and fullwidth quotes + * otherwise prevent {@code "don't"} from matching {@code "don" + U+2019 + "t"}. It is built from + * two {@link CharClass} sets, so membership is O(1) and scanning is a single cursor pass with no + * regular expression. ASCII quotes are left unchanged.

+ */ +public class QuoteCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + // Single quotes / apostrophes -> U+0027 APOSTROPHE. + private static final CharClass SINGLE = CharClass.of(CodePointSet.of( + 0x2018, // left single quotation mark + 0x2019, // right single quotation mark + 0x201A, // single low-9 quotation mark + 0x201B, // single high-reversed-9 quotation mark + 0x2039, // single left-pointing angle quotation mark + 0x203A, // single right-pointing angle quotation mark + 0x02BC, // modifier letter apostrophe + 0xFF07), // fullwidth apostrophe + '\''); + + // Double quotes -> U+0022 QUOTATION MARK. + private static final CharClass DOUBLE = CharClass.of(CodePointSet.of( + 0x201C, // left double quotation mark + 0x201D, // right double quotation mark + 0x201E, // double low-9 quotation mark + 0x201F, // double high-reversed-9 quotation mark + 0x00AB, // left-pointing double angle quotation mark + 0x00BB, // right-pointing double angle quotation mark + 0x301D, // reversed double prime quotation mark + 0x301E, // double prime quotation mark + 0x301F, // low double prime quotation mark + 0xFF02), // fullwidth quotation mark + '"'); + + private static final QuoteCharSequenceNormalizer INSTANCE = new QuoteCharSequenceNormalizer(); + + /** {@return the shared, stateless instance} */ + public static QuoteCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return DOUBLE.normalize(SINGLE.normalize(text)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java new file mode 100644 index 000000000..c1bac6409 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software 
Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.ArrayList; +import java.util.List; +import java.util.Objects; + +/** + * A fluent builder that composes the normalization rungs into a single + * {@link CharSequenceNormalizer}. + * + *

The rungs are applied in the order they are added, so the caller controls the chain. A + * conservative, search-oriented chain is available through {@link #searchDefault()}. Each rung is + * a shared, stateless normalizer; the built normalizer is an {@link AggregateCharSequenceNormalizer} + * that applies them in sequence.

+ * + *
{@code
+ * CharSequenceNormalizer n = TextNormalizer.builder()
+ *     .nfc().caseFold().accentFold()
+ *     .build();
+ * }
+ */ +public final class TextNormalizer { + + private final List<CharSequenceNormalizer> steps = new ArrayList<>(); + + private TextNormalizer() { + } + + /** {@return a new, empty builder} */ + public static TextNormalizer builder() { + return new TextNormalizer(); + } + + /** {@return this builder with NFC canonical composition appended} */ + public TextNormalizer nfc() { + return add(NfcCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with NFKC compatibility composition appended} */ + public TextNormalizer nfkc() { + return add(NfkcCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with invisible/bidi control stripping appended} */ + public TextNormalizer stripInvisible() { + return add(InvisibleCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with Unicode whitespace collapsing appended} */ + public TextNormalizer whitespace() { + return add(WhitespaceCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with quotation-mark folding appended} */ + public TextNormalizer quotes() { + return add(QuoteCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with dash folding appended} */ + public TextNormalizer dashes() { + return add(DashCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with decimal-digit folding appended} */ + public TextNormalizer digits() { + return add(DigitCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with ellipsis expansion appended} */ + public TextNormalizer ellipsis() { + return add(EllipsisCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with list-bullet replacement appended} */ + public TextNormalizer bullets() { + return add(BulletCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with case folding appended} */ + public TextNormalizer caseFold() { + return add(CaseFoldCharSequenceNormalizer.getInstance()); + } + + /** {@return this builder with script-gated
diacritic folding appended} */ + public TextNormalizer accentFold() { + return add(AccentFoldCharSequenceNormalizer.getInstance()); + } + + /** + * Appends a custom normalizer. + * + * @param custom The normalizer to append. + * @return This builder. + */ + public TextNormalizer with(CharSequenceNormalizer custom) { + return add(Objects.requireNonNull(custom, "custom")); + } + + /** {@return the composed normalizer for the rungs added so far} */ + public CharSequenceNormalizer build() { + return new AggregateCharSequenceNormalizer(steps.toArray(new CharSequenceNormalizer[0])); + } + + /** + * {@return a conservative search/matching chain} + * + *

The chain strips invisible controls, applies NFC, collapses whitespace, folds quotes and + * dashes, case folds, and finally applies script-gated diacritic folding.

+ */ + public static CharSequenceNormalizer searchDefault() { + return builder() + .stripInvisible() + .nfc() + .whitespace() + .quotes() + .dashes() + .caseFold() + .accentFold() + .build(); + } + + private TextNormalizer add(CharSequenceNormalizer normalizer) { + steps.add(normalizer); + return this; + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java index ba4a6ea4b..5db1a4683 100644 --- a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/AccentFoldCharSequenceNormalizerTest.java @@ -53,6 +53,36 @@ void testMapsStrokeAndLigatureLetters() { assertEquals("i", fold(cp(0x0131))); // dotless i } + @Test + void testEveryStrokeAndLigatureLetterMaps() { + assertEquals("o", fold(cp(0x00F8))); // o with stroke + assertEquals("O", fold(cp(0x00D8))); // O with stroke + assertEquals("ae", fold(cp(0x00E6))); // ae + assertEquals("AE", fold(cp(0x00C6))); // AE + assertEquals("oe", fold(cp(0x0153))); // oe + assertEquals("OE", fold(cp(0x0152))); // OE + assertEquals("ss", fold(cp(0x00DF))); // eszett + assertEquals("SS", fold(cp(0x1E9E))); // capital eszett + assertEquals("th", fold(cp(0x00FE))); // thorn + assertEquals("TH", fold(cp(0x00DE))); // capital thorn + assertEquals("d", fold(cp(0x00F0))); // eth + assertEquals("D", fold(cp(0x00D0))); // capital eth + assertEquals("d", fold(cp(0x0111))); // d with stroke + assertEquals("D", fold(cp(0x0110))); // D with stroke + assertEquals("l", fold(cp(0x0142))); // l with stroke + assertEquals("L", fold(cp(0x0141))); // L with stroke + assertEquals("h", fold(cp(0x0127))); // h with stroke + assertEquals("H", fold(cp(0x0126))); // H with stroke + assertEquals("i", fold(cp(0x0131))); // dotless i 
+ } + + @Test + void testLeadingCombiningMarkWithNoBaseIsKept() { + // A combining mark with no preceding base (baseScript == null) must be kept, not dropped. + final String input = cp(0x0301) + "x"; // combining acute, then x + assertEquals(input, fold(input)); + } + @Test void testFoldsGreekAndCyrillicAccents() { assertEquals(cp(0x03B1), fold(cp(0x03AC))); // Greek alpha with tonos -> alpha diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/SetBasedNormalizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/SetBasedNormalizerTest.java new file mode 100644 index 000000000..ea333f06b --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/SetBasedNormalizerTest.java @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertSame; + +public class SetBasedNormalizerTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + private static String quotes(String text) { + return QuoteCharSequenceNormalizer.getInstance().normalize(text).toString(); + } + + private static String digits(String text) { + return DigitCharSequenceNormalizer.getInstance().normalize(text).toString(); + } + + private static String invisible(String text) { + return InvisibleCharSequenceNormalizer.getInstance().normalize(text).toString(); + } + + private static String ellipsis(String text) { + return EllipsisCharSequenceNormalizer.getInstance().normalize(text).toString(); + } + + private static String bullet(String text) { + return BulletCharSequenceNormalizer.getInstance().normalize(text).toString(); + } + + // --- quotes ------------------------------------------------------------------------------ + + @Test + void testQuotesFoldSingleAndDouble() { + assertEquals("don't", quotes("don" + cp(0x2019) + "t")); // right single quote + assertEquals("\"hi\"", quotes(cp(0x201C) + "hi" + cp(0x201D))); // curly double quotes + assertEquals("\"x\"", quotes(cp(0x00AB) + "x" + cp(0x00BB))); // guillemets + assertEquals("'y'", quotes(cp(0x2039) + "y" + cp(0x203A))); // single angle quotes + assertEquals("'", quotes(cp(0xFF07))); // fullwidth apostrophe + assertEquals("\"", quotes(cp(0xFF02))); // fullwidth quotation mark + assertEquals("'", quotes(cp(0x02BC))); // modifier letter apostrophe + } + + @Test + void testQuotesLeaveAsciiAndNonQuotesAlone() { + assertEquals("'a' \"b\"", quotes("'a' \"b\"")); + assertEquals("abc", quotes("abc")); + assertEquals(cp(0x2014), quotes(cp(0x2014))); // em dash is not a quote + } + + @Test + void testQuotesSingleton() { + 
assertSame(QuoteCharSequenceNormalizer.getInstance(), QuoteCharSequenceNormalizer.getInstance()); + } + + // --- digits ------------------------------------------------------------------------------ + + @Test + void testDigitsMapDecimalDigitsToAscii() { + assertEquals("123", digits(cp(0x0661) + cp(0x0662) + cp(0x0663))); // arabic-indic 1 2 3 + assertEquals("12", digits(cp(0x0967) + cp(0x0968))); // devanagari 1 2 + assertEquals("15", digits(cp(0xFF11) + cp(0xFF15))); // fullwidth 1 5 + assertEquals("a5b", digits("a" + cp(0x0665) + "b")); // arabic-indic 5 + } + + @Test + void testDigitsLeaveAsciiAndNonDecimalNumeralsAlone() { + assertEquals("0123456789", digits("0123456789")); + assertEquals(cp(0x00B2), digits(cp(0x00B2))); // superscript two (category No) + assertEquals(cp(0x2160), digits(cp(0x2160))); // roman numeral one (category Nl) + assertEquals(cp(0x00BD), digits(cp(0x00BD))); // vulgar fraction one half (category No) + assertEquals("abc", digits("abc")); + } + + @Test + void testDigitsSingleton() { + assertSame(DigitCharSequenceNormalizer.getInstance(), DigitCharSequenceNormalizer.getInstance()); + } + + // --- invisible / bidi controls ----------------------------------------------------------- + + @Test + void testInvisibleRemovesFormatAndBidiControls() { + assertEquals("ab", invisible("a" + cp(0xFEFF) + "b")); // byte order mark + assertEquals("ab", invisible("a" + cp(0x200B) + "b")); // zero width space + assertEquals("ab", invisible("a" + cp(0x2060) + "b")); // word joiner + assertEquals("softhyphen", invisible("soft" + cp(0x00AD) + "hyphen")); + assertEquals("evil", invisible(cp(0x202E) + "evil" + cp(0x202C))); // bidi override + pop + } + + @Test + void testInvisibleKeepsJoinersVariationSelectorsAndText() { + final String zwj = "a" + cp(0x200D) + "b"; // zero width joiner is meaningful + assertEquals(zwj, invisible(zwj)); + final String zwnj = "a" + cp(0x200C) + "b"; // zero width non-joiner is meaningful + assertEquals(zwnj, invisible(zwnj)); + 
final String family = cp(0x1F468) + cp(0x200D) + cp(0x1F469); // ZWJ emoji sequence preserved + assertEquals(family, invisible(family)); + assertEquals("hello", invisible("hello")); + } + + @Test + void testInvisibleSingleton() { + assertSame(InvisibleCharSequenceNormalizer.getInstance(), + InvisibleCharSequenceNormalizer.getInstance()); + } + + // --- ellipsis ---------------------------------------------------------------------------- + + @Test + void testEllipsisExpandsToAsciiDots() { + assertEquals("...", ellipsis(cp(0x2026))); // horizontal ellipsis + assertEquals("wait...", ellipsis("wait" + cp(0x2026))); + assertEquals("..", ellipsis(cp(0x2025))); // two dot leader + assertEquals("...", ellipsis("...")); // ascii dots unchanged + } + + @Test + void testEllipsisSingleton() { + assertSame(EllipsisCharSequenceNormalizer.getInstance(), + EllipsisCharSequenceNormalizer.getInstance()); + } + + // --- bullets ----------------------------------------------------------------------------- + + @Test + void testBulletsBecomeSeparatorSpaces() { + assertEquals(" item", bullet(cp(0x2022) + "item")); // bullet + assertEquals(" item", bullet(cp(0x25E6) + "item")); // white bullet + assertEquals("a b", bullet("a" + cp(0x2043) + "b")); // hyphen bullet + } + + @Test + void testBulletsLeaveMiddleDotAndTextAlone() { + assertEquals(cp(0x00B7), bullet(cp(0x00B7))); // middle dot kept (Catalan) + assertEquals("plain", bullet("plain")); + } + + @Test + void testBulletSingleton() { + assertSame(BulletCharSequenceNormalizer.getInstance(), + BulletCharSequenceNormalizer.getInstance()); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TextNormalizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TextNormalizerTest.java new file mode 100644 index 000000000..64aa6df3a --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TextNormalizerTest.java @@ -0,0 +1,77 @@ +/* + * 
Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.Locale; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertThrows; + +public class TextNormalizerTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + @Test + void testRungsApplyInOrder() { + final CharSequenceNormalizer n = TextNormalizer.builder().caseFold().accentFold().build(); + assertEquals("cafe", n.normalize("CAF" + cp(0x00C9)).toString()); // CAFE-acute -> cafe + } + + @Test + void testEmptyBuilderIsIdentity() { + assertEquals("UnChanged", TextNormalizer.builder().build().normalize("UnChanged").toString()); + } + + @Test + void testWhitespaceAndFoldChain() { + final CharSequenceNormalizer n = TextNormalizer.builder() + .nfc().whitespace().caseFold().accentFold().build(); + assertEquals("cafe", n.normalize(" CAF" + cp(0x00C9) + " ").toString()); + } + + @Test + void testWithCustomNormalizer() { + final CharSequenceNormalizer up = s -> s.toString().toUpperCase(Locale.ROOT); + assertEquals("AB", 
TextNormalizer.builder().with(up).build().normalize("ab").toString()); + } + + @Test + void testWithRejectsNull() { + assertThrows(NullPointerException.class, () -> TextNormalizer.builder().with(null)); + } + + @Test + void testSearchDefaultCleansMessyInput() { + // BOM + curly-quoted, mixed-case, accented text -> stripped, ASCII-quoted, folded. + final String input = cp(0xFEFF) + cp(0x201C) + "Caf" + cp(0x00C9) + cp(0x201D); + assertEquals("\"cafe\"", TextNormalizer.searchDefault().normalize(input).toString()); + } + + @Test + void testEveryRungIsInvokable() { + final CharSequenceNormalizer n = TextNormalizer.builder() + .stripInvisible().nfc().nfkc().whitespace().quotes().dashes().digits().ellipsis().bullets() + .caseFold().accentFold().build(); + // BOM stripped, Arabic-Indic 1 -> 1, case + accent folded. + final String input = cp(0xFEFF) + "CAF" + cp(0x00C9) + " " + cp(0x0661); + assertEquals("cafe 1", n.normalize(input).toString()); + } +} From 19fb1b6300f7d671f31bf58d8af10067d46a12b4 Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Thu, 18 Jun 2026 23:12:04 -0400 Subject: [PATCH 06/11] OPENNLP-1850 - Add offset-safe input normalization opt-ins to the DL components InferenceOptions gains setNormalizeWhitespace and setNormalizeDashes (both off by default). When enabled, NameFinderDL and DocumentCategorizerDL fold input whitespace and/or dashes to their ASCII forms before inference via a shared AbstractDL.normalizeInput helper. The mapping is one code point to one ASCII character, so it is offset preserving for the Basic Multilingual Plane and any spans the model produces still align with the input. 
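As a rough illustration of why a one-code-point-to-one-character fold keeps offsets stable, the standalone sketch below mirrors the no-break space and em dash case exercised by the accompanying test. It is not the AbstractDL.normalizeInput implementation: the class name OffsetPreservingFoldSketch, the method fold, and the two small code point sets are assumptions made for this example, whereas the real helper delegates to the CharClass whitespace and dash sets.

```java
import java.util.Set;

// Hypothetical sketch; not part of the patch.
final class OffsetPreservingFoldSketch {

  // Illustrative subsets only (assumption). The real sets are CharClass.whitespace() and
  // CharClass.dashes().
  private static final Set<Integer> WHITESPACE = Set.of(0x0009, 0x00A0, 0x2003, 0x3000);
  private static final Set<Integer> DASHES = Set.of(0x2010, 0x2013, 0x2014);

  /** Folds members of the enabled sets to one ASCII char each; everything else is copied. */
  static String fold(String text, boolean normalizeWhitespace, boolean normalizeDashes) {
    final StringBuilder out = new StringBuilder(text.length());
    int i = 0;
    while (i < text.length()) {
      final int codePoint = text.codePointAt(i);
      if (normalizeWhitespace && WHITESPACE.contains(codePoint)) {
        out.append(' ');
      } else if (normalizeDashes && DASHES.contains(codePoint)) {
        out.append('-');
      } else {
        out.appendCodePoint(codePoint);
      }
      i += Character.charCount(codePoint);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    final String input = "a\u00A0b\u2014c"; // a, no-break space, b, em dash, c
    final String folded = fold(input, true, true);
    System.out.println(folded + " (same length: " + (folded.length() == input.length()) + ")");
  }
}
```

Every code point in these sets is in the Basic Multilingual Plane, so it occupies one Java char and is replaced by one char. A member outside the BMP would occupy two chars and be replaced by one, which would shift every later offset by one; that is why the guarantee is stated for the BMP only.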
--- .../src/main/java/opennlp/dl/AbstractDL.java | 26 ++++++++++++++ .../java/opennlp/dl/InferenceOptions.java | 34 +++++++++++++++++++ .../dl/doccat/DocumentCategorizerDL.java | 11 +++++- .../opennlp/dl/namefinder/NameFinderDL.java | 7 +++- .../opennlp/dl/AbstractDLChunkingTest.java | 21 ++++++++++++ 5 files changed, 97 insertions(+), 2 deletions(-) diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java index 5b0a14f88..483788366 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/AbstractDL.java @@ -336,6 +336,32 @@ protected static void validateSplitOptions(final int documentSplitSize, final in */ protected static final CharClass WHITESPACE = CharClass.whitespace(); + /** Unicode dashes (excluding the mathematical minus signs), used for optional input folding. */ + protected static final CharClass DASHES = CharClass.dashes(); + + /** + * Optionally folds Unicode whitespace and/or dashes in the input to their ASCII forms before + * inference. Each member code point maps to exactly one ASCII character, so the transform is + * offset preserving for Basic Multilingual Plane characters and any spans a model produces still + * align with the input. + * + * @param text The input text. + * @param normalizeWhitespace Whether to fold whitespace to ASCII spaces. + * @param normalizeDashes Whether to fold dashes to the ASCII hyphen. + * @return The optionally normalized text. 
+ */ + protected static String normalizeInput(final String text, final boolean normalizeWhitespace, + final boolean normalizeDashes) { + String result = text; + if (normalizeWhitespace) { + result = WHITESPACE.normalize(result).toString(); + } + if (normalizeDashes) { + result = DASHES.normalize(result).toString(); + } + return result; + } + /** * Splits {@code text} on Unicode whitespace and groups the resulting tokens into overlapping * chunks, each rejoined with single ASCII spaces, ready for WordPiece tokenization. The split diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/InferenceOptions.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/InferenceOptions.java index 344c5846d..f74effb29 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/InferenceOptions.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/InferenceOptions.java @@ -26,6 +26,8 @@ public class InferenceOptions { private int documentSplitSize = 250; private int splitOverlapSize = 50; private Boolean lowerCase; + private boolean normalizeWhitespace; + private boolean normalizeDashes; public boolean isIncludeAttentionMask() { return includeAttentionMask; @@ -75,6 +77,38 @@ public void setSplitOverlapSize(int splitOverlapSize) { this.splitOverlapSize = splitOverlapSize; } + /** {@return whether input whitespace is normalized to ASCII spaces before inference} */ + public boolean isNormalizeWhitespace() { + return normalizeWhitespace; + } + + /** + * Replaces every Unicode whitespace character in the input with an ASCII space before inference. + * This is offset preserving (each whitespace code point maps to one space), so any spans a model + * produces still align with the input. Off by default. + * + * @param normalizeWhitespace Whether to normalize whitespace. 
+ */ + public void setNormalizeWhitespace(boolean normalizeWhitespace) { + this.normalizeWhitespace = normalizeWhitespace; + } + + /** {@return whether input dashes are normalized to the ASCII hyphen before inference} */ + public boolean isNormalizeDashes() { + return normalizeDashes; + } + + /** + * Replaces Unicode dashes in the input with the ASCII hyphen-minus before inference. This is + * offset preserving for the dash characters in the Basic Multilingual Plane (the common case). + * The mathematical minus signs are not affected. Off by default. + * + * @param normalizeDashes Whether to normalize dashes. + */ + public void setNormalizeDashes(boolean normalizeDashes) { + this.normalizeDashes = normalizeDashes; + } + /** * Returns whether tokenization should lower case the input text and strip * accents, as required by uncased models. diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java index c7293fc8b..c73ef6de0 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/doccat/DocumentCategorizerDL.java @@ -85,6 +85,8 @@ public class DocumentCategorizerDL extends AbstractDL implements DocumentCategor private final boolean includeTokenTypeIds; private final int documentSplitSize; private final int splitOverlapSize; + private final boolean normalizeWhitespace; + private final boolean normalizeDashes; /** * Test-only constructor that injects an already-built {@link OrtSession} (or {@code null}), @@ -101,6 +103,8 @@ public class DocumentCategorizerDL extends AbstractDL implements DocumentCategor this.includeTokenTypeIds = inferenceOptions.isIncludeTokenTypeIds(); this.documentSplitSize = inferenceOptions.getDocumentSplitSize(); this.splitOverlapSize = inferenceOptions.getSplitOverlapSize(); + 
this.normalizeWhitespace = inferenceOptions.isNormalizeWhitespace(); + this.normalizeDashes = inferenceOptions.isNormalizeDashes(); } /** @@ -132,6 +136,8 @@ public DocumentCategorizerDL(File model, File vocabulary, Map c this.includeTokenTypeIds = inferenceOptions.isIncludeTokenTypeIds(); this.documentSplitSize = inferenceOptions.getDocumentSplitSize(); this.splitOverlapSize = inferenceOptions.getSplitOverlapSize(); + this.normalizeWhitespace = inferenceOptions.isNormalizeWhitespace(); + this.normalizeDashes = inferenceOptions.isNormalizeDashes(); } @@ -165,6 +171,8 @@ public DocumentCategorizerDL(File model, File vocabulary, File config, this.includeTokenTypeIds = inferenceOptions.isIncludeTokenTypeIds(); this.documentSplitSize = inferenceOptions.getDocumentSplitSize(); this.splitOverlapSize = inferenceOptions.getSplitOverlapSize(); + this.normalizeWhitespace = inferenceOptions.isNormalizeWhitespace(); + this.normalizeDashes = inferenceOptions.isNormalizeDashes(); } @@ -327,8 +335,9 @@ private int getKey(String value) { } - private List tokenize(final String text) { + private List tokenize(final String input) { + final String text = normalizeInput(input, normalizeWhitespace, normalizeDashes); final List t = new LinkedList<>(); // Segment long input text into overlapping chunks (split on Unicode whitespace) configured by diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java index eff6b87d5..555f71323 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/main/java/opennlp/dl/namefinder/NameFinderDL.java @@ -102,6 +102,8 @@ public class NameFinderDL extends AbstractDL implements TokenNameFinder { private final boolean includeTokenTypeIds; private final int documentSplitSize; private final int splitOverlapSize; + private final boolean 
normalizeWhitespace; + private final boolean normalizeDashes; /** * Instantiates a {@link TokenNameFinder name finder} using ONNX models. @@ -151,6 +153,8 @@ public NameFinderDL(File model, File vocabulary, Map ids2Labels this.includeTokenTypeIds = inferenceOptions.isIncludeTokenTypeIds(); this.documentSplitSize = inferenceOptions.getDocumentSplitSize(); this.splitOverlapSize = inferenceOptions.getSplitOverlapSize(); + this.normalizeWhitespace = inferenceOptions.isNormalizeWhitespace(); + this.normalizeDashes = inferenceOptions.isNormalizeDashes(); this.sentenceDetector = sentenceDetector; } @@ -183,7 +187,8 @@ public Span[] find(String[] input) { final List spans = new ArrayList<>(); // Join the tokens here because they will be tokenized using Wordpiece during inference. - final String text = String.join(" ", input); + final String text = + normalizeInput(String.join(" ", input), normalizeWhitespace, normalizeDashes); // sentPosDetect (not sentDetect) so each sentence's offset in the full text is known. final Span[] sentenceSpans = sentenceDetector.sentPosDetect(text); diff --git a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java index 38ab38450..386f47ee7 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java +++ b/opennlp-core/opennlp-ml/opennlp-dl/src/test/java/opennlp/dl/AbstractDLChunkingTest.java @@ -58,4 +58,25 @@ void testAppliesChunkOverlap() { void testEmptyTextYieldsNoChunks() { assertEquals(List.of(), AbstractDL.whitespaceChunks("", 100, 0)); } + + @Test + void testNormalizeInputIsOptInAndOffsetPreserving() { + final String nbsp = new String(Character.toChars(0x00A0)); + final String emDash = new String(Character.toChars(0x2014)); + final String input = "a" + nbsp + "b" + emDash + "c"; + + // Off by default: unchanged. 
+ assertEquals(input, AbstractDL.normalizeInput(input, false, false)); + + // Whitespace only: the no-break space becomes a space, and the length is preserved. + final String ws = AbstractDL.normalizeInput(input, true, false); + assertEquals("a b" + emDash + "c", ws); + assertEquals(input.length(), ws.length()); + + // Dashes only: the em dash becomes an ASCII hyphen. + assertEquals("a" + nbsp + "b-c", AbstractDL.normalizeInput(input, false, true)); + + // Both. + assertEquals("a b-c", AbstractDL.normalizeInput(input, true, true)); + } } From 858fb7f571cecb57eea870e5f74494bdcbe90bc8 Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Thu, 18 Jun 2026 23:12:04 -0400 Subject: [PATCH 07/11] OPENNLP-1850 - Document text normalization in the manual Add a Text Normalization chapter to the developer manual covering the normalizer family, the TextNormalizer pipeline, script-gated diacritic folding and its multilingual safety, the CharClass engine and user-defined code point sets, offset-preserving analysis, and the Unicode reference data. --- opennlp-docs/src/docbkx/normalizer.xml | 317 +++++++++++++++++++++++++ opennlp-docs/src/docbkx/opennlp.xml | 1 + 2 files changed, 318 insertions(+) create mode 100644 opennlp-docs/src/docbkx/normalizer.xml diff --git a/opennlp-docs/src/docbkx/normalizer.xml b/opennlp-docs/src/docbkx/normalizer.xml new file mode 100644 index 000000000..d14177db1 --- /dev/null +++ b/opennlp-docs/src/docbkx/normalizer.xml @@ -0,0 +1,317 @@ + + + + + + + Text Normalization + +
+ Introduction + + The package opennlp.tools.util.normalizer provides Unicode-aware text + normalization for matching, search, and tokenization preprocessing. It cleans up the + kinds of inconsistency that real text carries when it is copied from the web, PDFs, + office documents, or multilingual sources: spacing that is not an ordinary space, the + many dash and quotation variants, decomposed versus precomposed accents, non-ASCII + digits, and invisible control characters. + + + The implementation follows three principles: + + + + + Standards-sourced. Membership sets come from the + Unicode Character Database (for example the White_Space and + Dash properties), not from the JVM's locale-dependent or quirky + character predicates. The library never relies on + Character.isWhitespace, which disagrees with the Unicode standard. + + + + + Cursor-based, no regular expressions. Every + operation is a single forward pass over the input that tests membership in O(1) + and advances by code point. This avoids the allocation and the catastrophic + backtracking (ReDoS) risk of regular expressions, and it correctly recognizes + Unicode characters that Java's \s does not. + + + + + Offset-preserving. The original text is always + the source of truth. Normalization produces a derived form for matching while the + original character offsets are kept, so a search hit can be reported and + highlighted against the source even when the normalized form has a different + length. + + + + + There are two layers. The CharSequenceNormalizer family offers ready-made, + composable normalizers; the CharClass engine is the low-level, configurable + building block they are made of. + +
+ +
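The first two principles above can be illustrated without the library. The following is a minimal sketch, not the OpenNLP implementation: the class and method names (`UnicodeWhitespaceSketch`, `isUnicodeWhitespace`, `collapse`) are hypothetical, and the membership test is a hard-coded transcription of the 25 `White_Space` code points rather than a table loaded from the Unicode Character Database. It shows where the JVM predicate and the Unicode property disagree, and how a single forward pass by code point can collapse whitespace runs without a regular expression.

```java
public final class UnicodeWhitespaceSketch {

  /** True for the 25 code points that carry the Unicode White_Space property. */
  static boolean isUnicodeWhitespace(int cp) {
    return (cp >= 0x0009 && cp <= 0x000D) || cp == 0x0020 || cp == 0x0085 || cp == 0x00A0
        || cp == 0x1680 || (cp >= 0x2000 && cp <= 0x200A) || cp == 0x2028 || cp == 0x2029
        || cp == 0x202F || cp == 0x205F || cp == 0x3000;
  }

  /** Collapses each whitespace run to one ASCII space and trims both edges, in one pass. */
  static String collapse(CharSequence text) {
    final StringBuilder out = new StringBuilder(text.length());
    boolean pendingSpace = false;
    for (int i = 0; i < text.length(); ) {
      final int cp = Character.codePointAt(text, i);
      i += Character.charCount(cp);
      if (isUnicodeWhitespace(cp)) {
        // Remember the run but emit nothing yet, so leading and trailing runs vanish.
        pendingSpace = out.length() > 0;
      } else {
        if (pendingSpace) {
          out.append(' ');
          pendingSpace = false;
        }
        out.appendCodePoint(cp);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // NO-BREAK SPACE: White_Space in Unicode, but rejected by the JVM predicate.
    System.out.println(Character.isWhitespace(0x00A0));  // false
    System.out.println(isUnicodeWhitespace(0x00A0));     // true
    // FILE SEPARATOR: accepted by the JVM predicate, but not White_Space in Unicode.
    System.out.println(Character.isWhitespace(0x001C));  // true
    System.out.println(isUnicodeWhitespace(0x001C));     // false
    System.out.println(collapse("\u00A0a\u3000 \u2003b\u00A0"));  // a b
  }
}
```

The chained comparison stands in for the O(1) table lookup that the text describes; a real implementation would presumably use a bit set or array indexed by code point.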
+ The normalizer family + + Each normalizer implements the existing + opennlp.tools.util.normalizer.CharSequenceNormalizer interface + (CharSequence normalize(CharSequence)) and is a shared, stateless singleton + obtained through getInstance(). They can therefore be combined with the + existing AggregateCharSequenceNormalizer, or with the + TextNormalizer builder described below. + + + + + + + Normalizer + Effect + + + + + WhitespaceCharSequenceNormalizer + Collapses each run of Unicode whitespace to a single ASCII space and + trims the edges. + + + DashCharSequenceNormalizer + Maps every Unicode dash to the ASCII hyphen-minus. The mathematical + minus signs and the soft hyphen are not affected. + + + QuoteCharSequenceNormalizer + Folds typographic single quotes and apostrophes to ' and + double quotes (including guillemets) to ". + + + DigitCharSequenceNormalizer + Maps Unicode decimal digits (Arabic-Indic, Devanagari, fullwidth, ...) + to ASCII 0-9 by their numeric value. + + + EllipsisCharSequenceNormalizer + Expands the horizontal ellipsis to ... and the two-dot + leader to .. + + + BulletCharSequenceNormalizer + Replaces unambiguous list bullets with a space; the Catalan middle dot + is left alone. + + + InvisibleCharSequenceNormalizer + Removes invisible format and bidirectional control characters (BOM, + zero width space, bidi marks/overrides/isolates, ...). The zero width + joiner and non-joiner and variation selectors are kept. + + + NfcCharSequenceNormalizer + Applies Unicode Normalization Form C (canonical composition); a safe, + lossless baseline for matching. + + + NfkcCharSequenceNormalizer + Applies Unicode Normalization Form KC (compatibility composition); + folds fullwidth forms, ligatures, and super/subscripts. + + + CaseFoldCharSequenceNormalizer + Lower cases for case-insensitive matching, using + Locale.ROOT. + + + AccentFoldCharSequenceNormalizer + Folds diacritics in a script-aware way (see below). 
+ + + + + + + A single normalizer is applied directly: + + + + +
+ +
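As an illustration of what a single member of the family does, the digit row of the table can be sketched in a few lines. This is a hedged stand-in, not the `DigitCharSequenceNormalizer` source: the class name `DigitFoldSketch` and method `foldDigits` are hypothetical, and the sketch leans on the JVM's `Character.isDigit` and `Character.digit` tables for brevity, whereas the chapter states that the library sources its sets from the Unicode Character Database.

```java
public final class DigitFoldSketch {

  /** Replaces every Unicode decimal digit with the ASCII digit of the same numeric value. */
  static String foldDigits(CharSequence text) {
    final StringBuilder out = new StringBuilder(text.length());
    for (int i = 0; i < text.length(); ) {
      final int cp = Character.codePointAt(text, i);
      i += Character.charCount(cp);
      if (Character.isDigit(cp)) {
        // Character.digit returns 0..9 for any decimal digit, whatever its script.
        out.append((char) ('0' + Character.digit(cp, 10)));
      } else {
        out.appendCodePoint(cp);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(foldDigits("\u0663\u0664"));  // Arabic-Indic three, four -> 34
    System.out.println(foldDigits("\u0967\u0968"));  // Devanagari one, two -> 12
    System.out.println(foldDigits("\uFF17"));        // fullwidth seven -> 7
  }
}
```

Like the normalizers in the table, the method is stateless, so one shared instance (or a static method, as here) is safe to call from several threads.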
+ Composing a pipeline + + TextNormalizer is a fluent builder that composes individual normalizers, in the order + they are added, into a single CharSequenceNormalizer: + + + + + + A conservative search-oriented chain (strip invisibles, NFC, collapse whitespace, fold + quotes and dashes, case fold, then script-gated accent fold) is available directly: + + + + + + Any custom CharSequenceNormalizer can be inserted with + with(...). None of these normalizers is applied automatically by any OpenNLP + component; normalization is always an explicit, opt-in choice. +
+ +
+ Diacritic folding and multilingual safety + + AccentFoldCharSequenceNormalizer folds accents for search, but does so in a + script-aware way that a Latin-only folding filter cannot. It decomposes the text, then + drops nonspacing combining marks only for base characters whose script is configured for + folding (Latin, Greek, and Cyrillic by default). Combining marks on other scripts are + left untouched, because there they are essential orthography rather than decoration: + dropping an Indic vowel sign or virama, an Arabic harakat, a Hebrew point, or a Thai + vowel would change the word. + + + alpha) +fold.normalize("का"); // unchanged (Devanagari is left intact)]]> + + + Atomic Latin letters that do not decompose are mapped to an ASCII approximation by + default: for example the stroke letters and ligatures, eszett, and thorn + (ø -> o, æ -> ae, ß -> ss, + þ -> th). Both behaviors are configurable through the constructor: + + + + + + Diacritic folding is a recall optimization, not a linguistically correct transform, so it + is intended for a search or matching form rather than for display. Language-specific case + and letter rules (for example German DIN umlaut expansion, or the Turkish + dotless-i) are out of scope for the default folder and should be applied with an explicit + locale upstream. + +
+ +
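The script gate described in this section can be sketched with the JDK alone. The code below is an approximation under stated assumptions, not the `AccentFoldCharSequenceNormalizer` source: the class name `ScriptGatedAccentFold` and its `fold` method are hypothetical, the folded scripts are fixed to the three defaults named above, and the ASCII approximation of atomic letters (stroke letters, ligatures, eszett, thorn) is left out entirely.

```java
import java.text.Normalizer;
import java.util.EnumSet;
import java.util.Set;

public final class ScriptGatedAccentFold {

  // Scripts whose nonspacing marks are treated as foldable decoration (the stated defaults).
  private static final Set<Character.UnicodeScript> FOLDED = EnumSet.of(
      Character.UnicodeScript.LATIN,
      Character.UnicodeScript.GREEK,
      Character.UnicodeScript.CYRILLIC);

  /** Decomposes the text, then drops nonspacing marks only after a base in a folded script. */
  static String fold(String input) {
    final String nfd = Normalizer.normalize(input, Normalizer.Form.NFD);
    final StringBuilder out = new StringBuilder(nfd.length());
    boolean foldMarks = false;
    for (int i = 0; i < nfd.length(); ) {
      final int cp = nfd.codePointAt(i);
      i += Character.charCount(cp);
      if (Character.getType(cp) == Character.NON_SPACING_MARK) {
        if (foldMarks) {
          continue;  // accent on a Latin, Greek, or Cyrillic base: drop it
        }
      } else {
        // A new base character decides whether the marks that follow it are dropped.
        foldMarks = FOLDED.contains(Character.UnicodeScript.of(cp));
      }
      out.appendCodePoint(cp);
    }
    // Recompose so that text in untouched scripts comes back in its canonical composed form.
    return Normalizer.normalize(out, Normalizer.Form.NFC);
  }

  public static void main(String[] args) {
    System.out.println(fold("caf\u00E9"));      // Latin e with acute -> cafe
    System.out.println(fold("\u03AC"));         // Greek alpha with tonos -> plain alpha
    System.out.println(fold("\u0915\u0941"));   // Devanagari ka + vowel sign u, unchanged
  }
}
```

Tracking the script of the most recent base character is what separates this from a Latin-only filter that strips every combining mark: the Devanagari vowel sign is also a nonspacing mark, but it survives because its base is outside the folded scripts.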
+ The CharClass engine and code point sets + + The set-based normalizers are built on CharClass, a configurable class of + Unicode code points paired with a single canonical replacement, backed by a + CodePointSet with O(1) membership. Whitespace and dashes are the two built-in + presets, and any other class is one more configured instance: + + + U+0020 +CharClass dash = CharClass.dashes(); // Unicode Dash (curated) -> U+002D + +ws.collapse("a b"); // "a b" (runs -> one space) +ws.trim(" hi "); // "hi" +String[] tokens = ws.split("one two"); // ["one", "two"] (offset-aware via splitSpans) +dash.normalize("a—b"); // "a-b"]]> + + + A CodePointSet can be built explicitly, as a range, by union, or loaded from + a user definitions file so that delimiters can be extended without a code change. The + file is line oriented and parsed with the same cursor approach (no regular expression): a + [name] line opens a section, a # begins a comment, and each + remaining line is a hex code point or an inclusive range. + + + + + + + +
+ +
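The line-oriented definitions file can be parsed with the same cursor style. The sketch below is illustrative only: the class `CodePointSetFileSketch` and its `parse` method are hypothetical and do not reflect the `CodePointSet` API, and the `..` range separator is an assumption borrowed from the Unicode data files, since the separator is not spelled out in the text above. Each named section is collected into a `java.util.BitSet`, which gives the O(1) membership test the section mentions.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public final class CodePointSetFileSketch {

  /** Parses the definitions into named sets; each set is a BitSet indexed by code point. */
  static Map<String, BitSet> parse(String definitions) {
    final Map<String, BitSet> sets = new LinkedHashMap<>();
    BitSet current = null;
    int pos = 0;
    while (pos <= definitions.length()) {
      // Cut the next line with indexOf instead of a regular expression.
      int newline = definitions.indexOf('\n', pos);
      if (newline < 0) {
        newline = definitions.length();
      }
      final String line = definitions.substring(pos, newline);
      pos = newline + 1;

      final int hash = line.indexOf('#');  // '#' begins a comment
      final String content = (hash < 0 ? line : line.substring(0, hash)).strip();
      if (content.isEmpty()) {
        continue;
      }
      if (content.startsWith("[") && content.endsWith("]")) {
        final String name = content.substring(1, content.length() - 1).strip();
        current = sets.computeIfAbsent(name, k -> new BitSet());
        continue;
      }
      if (current == null) {
        throw new IllegalArgumentException("Code point before any [name] section: " + content);
      }
      final int dots = content.indexOf("..");  // assumed range separator
      if (dots < 0) {
        current.set(Integer.parseInt(content, 16));
      } else {
        final int start = Integer.parseInt(content.substring(0, dots).strip(), 16);
        final int end = Integer.parseInt(content.substring(dots + 2).strip(), 16);
        current.set(start, end + 1);  // BitSet ranges are end-exclusive
      }
    }
    return sets;
  }

  public static void main(String[] args) {
    final Map<String, BitSet> sets =
        parse("# demo\n[delimiters]\n00B7  # middle dot\n2010..2015\n");
    System.out.println(sets.get("delimiters").get(0x2013));  // true
  }
}
```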
+ Offset-preserving analysis for search + + TextAnalyzer tokenizes text and normalizes each token while keeping every + token's source span. This is the building block for BM25-style matching: the normalized + term is what you index or query, and the Span ties it back to the original + text for highlighting, even when normalization changes a token's length. + + + character span in the original text + // token.original() -> the raw token, e.g. "Café" + // token.normalized() -> the search term, e.g. "cafe" +} + +List terms = analyzer.terms("Café au lait"); // ["cafe", "au", "lait"]]]> + +
+ +
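The idea of keeping the source span next to the normalized term can be shown in a self-contained sketch. Everything here is hypothetical and simplified: `OffsetPreservingSketch` and its `Token` record only mirror the `original()` and `normalized()` accessors mentioned above, tokens are split on JVM whitespace rather than by the library's tokenizer, and the normalizer is a plain accent strip plus lower casing with no script gate.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public final class OffsetPreservingSketch {

  /** A token with its half-open [start, end) span in the source text and its search form. */
  record Token(int start, int end, String original, String normalized) { }

  private static boolean isDelimiter(char c) {
    // Simplification: the JVM predicates, not the Unicode White_Space table.
    return Character.isWhitespace(c) || Character.isSpaceChar(c);
  }

  /** Simplified search form: decompose, drop nonspacing marks, lower case. */
  static String normalize(String token) {
    final String nfd = Normalizer.normalize(token, Normalizer.Form.NFD);
    final StringBuilder out = new StringBuilder(nfd.length());
    for (int i = 0; i < nfd.length(); ) {
      final int cp = nfd.codePointAt(i);
      i += Character.charCount(cp);
      if (Character.getType(cp) != Character.NON_SPACING_MARK) {
        out.appendCodePoint(cp);
      }
    }
    return out.toString().toLowerCase(Locale.ROOT);
  }

  /** Tokenizes on delimiters and normalizes each token while recording its source span. */
  static List<Token> analyze(String text) {
    final List<Token> tokens = new ArrayList<>();
    final int n = text.length();
    int i = 0;
    while (i < n) {
      while (i < n && isDelimiter(text.charAt(i))) {
        i++;
      }
      final int start = i;
      while (i < n && !isDelimiter(text.charAt(i))) {
        i++;
      }
      if (i > start) {
        final String original = text.substring(start, i);
        tokens.add(new Token(start, i, original, normalize(original)));
      }
    }
    return tokens;
  }

  public static void main(String[] args) {
    // The accent is written as a combining mark, so the raw token is one char longer
    // than its normalized term; the span still addresses the source text.
    for (Token t : analyze("Cafe\u0301 au lait")) {
      System.out.println(t.start() + ".." + t.end() + " " + t.normalized());
    }
  }
}
```

Because the span is taken from the source before normalization runs, a hit on the term `cafe` can be highlighted with `text.substring(start, end)` regardless of how many characters the normalizer added or removed.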
+ Reference data + + The underlying Unicode data is also available directly as immutable reference tables, + with O(1) membership tests that match the Unicode standard: + + + + + UnicodeWhitespace lists the 25 characters carrying the + White_Space property, plus the related look-alike format characters + (zero width space, byte order mark, ...) that are not + whitespace. It exposes isWhitespace(int), + byCodePoint(int), and helpers for the line breaks and the + non-breaking spaces. + + + + + UnicodeDash lists every code point carrying the Dash + property, distinguishing the mathematical minus signs that are excluded from the + default normalization set. + + + +
+ +
diff --git a/opennlp-docs/src/docbkx/opennlp.xml b/opennlp-docs/src/docbkx/opennlp.xml index 67eb1edf1..843bfbc9b 100644 --- a/opennlp-docs/src/docbkx/opennlp.xml +++ b/opennlp-docs/src/docbkx/opennlp.xml @@ -101,6 +101,7 @@ under the License. + From be2ad3dd11f4d8696f39cca4da405d9e2b3dceb9 Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Fri, 19 Jun 2026 13:41:06 -0400 Subject: [PATCH 08/11] OPENNLP-1850 Unicode-aware text normalization and UAX #29 word tokenizer Additive Unicode text handling for matching, search, and tokenization preprocessing (new types only, no breaking changes). UAX #29 word tokenizer (opennlp.tools.tokenize.uax29): - WordSegmenter, WordTokenizer (implements opennlp.tools.tokenize.Tokenizer), and WordType. A single-pass, table-driven engine with O(1) Word_Break lookups and no regular expression; 100% conformant on the official Unicode 17.0 WordBreakTest suite (1944/1944). Offset-preserving spans and a zero-allocation streaming API. Text normalization (opennlp.tools.util.normalizer): - The layered Term model (Dimension, Term, TermAnalyzer): a token as a stack of normalization layers (NFC, NFKC, whitespace, dash, case fold, accent fold, confusable fold, stem, lemma) with eager configured layers, lazy memoized extras, and O(1) peel; integrates the UAX #29 tokenizer and the existing Stemmer/Lemmatizer as the token-level layers. - Confusable (homoglyph) skeleton folding per UTS #39, from the bundled Unicode security data. - Per-language profiles (NormalizationProfile, NormalizationProfiles) mirroring the Snowball algorithm set with LanguageDetector fallback, including a German DIN 5007-2 umlaut fold (a-umlaut to ae, eszett to ss). - First-class builder configuration: whitespace/dash fold targets, locale case folding, accent-fold script scope, and max token length, over a general transform(dimension, normalizer) hook. 
Documentation: a Text Normalization chapter and a UAX #29 tokenizer section in the manual; the bundled Unicode data files (WordBreakProperty, emoji-data, WordBreakTest, confusables) are attributed in NOTICE. Tests: UAX #29 boundary conformance and unit tests, and unit tests for the normalizer engine, term model, confusables, language profiles, and German fold. --- NOTICE | 21 + opennlp-core/opennlp-ml/opennlp-dl/README.md | 28 + .../tokenize/uax29/ExtendedPictographic.java | 87 + .../tools/tokenize/uax29/WordBreak.java | 99 + .../tokenize/uax29/WordBreakProperty.java | 157 + .../tools/tokenize/uax29/WordSegmenter.java | 397 + .../tools/tokenize/uax29/WordToken.java | 39 + .../tools/tokenize/uax29/WordTokenizer.java | 166 + .../tools/tokenize/uax29/WordType.java | 148 + .../CaseFoldCharSequenceNormalizer.java | 43 +- ...fusableSkeletonCharSequenceNormalizer.java | 47 + .../tools/util/normalizer/Confusables.java | 120 + .../tools/util/normalizer/Dimension.java | 62 + .../GermanUmlautCharSequenceNormalizer.java | 88 + .../util/normalizer/NormalizationProfile.java | 66 + .../normalizer/NormalizationProfiles.java | 118 + .../opennlp/tools/util/normalizer/Term.java | 112 + .../tools/util/normalizer/TermAnalyzer.java | 372 + .../tokenize/uax29/ExtendedPictographic.txt | 461 + .../tokenize/uax29/WordBreakProperty.txt | 1541 +++ .../tools/util/normalizer/confusables.txt | 9994 +++++++++++++++++ .../uax29/ExtendedPictographicTest.java | 46 + .../uax29/WordBoundaryConformanceTest.java | 96 + .../tokenize/uax29/WordBreakPropertyTest.java | 87 + .../tokenize/uax29/WordSegmenterTest.java | 110 + .../tokenize/uax29/WordTokenizerTest.java | 164 + .../CaseFoldCharSequenceNormalizerTest.java | 63 + .../util/normalizer/ConfusablesTest.java | 81 + ...ermanUmlautCharSequenceNormalizerTest.java | 65 + .../normalizer/NormalizationProfilesTest.java | 134 + .../util/normalizer/TermAnalyzerTest.java | 211 + .../tools/tokenize/uax29/WordBreakTest.txt | 1974 ++++ 
opennlp-docs/src/docbkx/doccat.xml | 18 + opennlp-docs/src/docbkx/introduction.xml | 3 +- opennlp-docs/src/docbkx/namefinder.xml | 27 +- opennlp-docs/src/docbkx/normalizer.xml | 248 +- opennlp-docs/src/docbkx/tokenizer.xml | 91 +- 37 files changed, 17567 insertions(+), 17 deletions(-) create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/ExtendedPictographic.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreak.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreakProperty.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordSegmenter.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordToken.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordTokenizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordType.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/ConfusableSkeletonCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Confusables.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizer.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfile.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfiles.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Term.java create mode 100644 opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java create 
mode 100644 opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/ExtendedPictographic.txt create mode 100644 opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/WordBreakProperty.txt create mode 100644 opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/ExtendedPictographicTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBoundaryConformanceTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBreakPropertyTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordSegmenterTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordTokenizerTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizerTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/ConfusablesTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizerTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/NormalizationProfilesTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TermAnalyzerTest.java create mode 100644 opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29/WordBreakTest.txt diff --git a/NOTICE b/NOTICE index 340eca6b0..4a8b22075 100644 --- a/NOTICE +++ b/NOTICE @@ -92,6 +92,27 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+============================================================================ + +The Unicode Character Database data files in +opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29 +(WordBreakProperty.txt, the upstream WordBreakProperty-17.0.0.txt, and +ExtendedPictographic.txt, the upstream emoji-data.txt) and the conformance test +data in +opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29 +(WordBreakTest.txt, the upstream WordBreakTest-17.0.0.txt) are unmodified data +files from the Unicode Character Database, version 17.0.0, published by Unicode, +Inc. (https://www.unicode.org/Public/UCD/). The Unicode security data file +opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt +is the unmodified confusables.txt, version 17.0.0, from the Unicode Security +Mechanisms (UTS #39, https://www.unicode.org/Public/security/). + +Copyright (c) 1991-2025 Unicode, Inc. All rights reserved. +Distributed under the Unicode Terms of Use and License +(https://www.unicode.org/terms_of_use.html, https://www.unicode.org/license.txt). +The original Unicode copyright and license header is preserved verbatim at the +top of each bundled file. + ============================================================================ List of third-party dependencies grouped by their license type. diff --git a/opennlp-core/opennlp-ml/opennlp-dl/README.md b/opennlp-core/opennlp-ml/opennlp-dl/README.md index 04a7715d4..853c0f953 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/README.md +++ b/opennlp-core/opennlp-ml/opennlp-dl/README.md @@ -22,6 +22,31 @@ Named entity models are commonly cased, so lower casing is disabled by default. Set `InferenceOptions#setLowerCase(true)` only for models trained with uncased input. 
+### Unicode text handling + +Long input is split into overlapping chunks on the full Unicode `White_Space` +set (not Java's `\s`), so no-break space, ideographic space, and the other UCD +whitespace characters are recognized as delimiters. `NameFinderDL` locates +reconstructed entity text in the original input with a cursor-based matcher that +treats span spaces as flexible Unicode whitespace and compares other code points +case-insensitively, so `Span#getCoveredText(...)` works on text from PDFs, the +web, and multilingual sources. + +Optional input folding is off by default and controlled through +`InferenceOptions`: + +```java +InferenceOptions options = new InferenceOptions(); +options.setNormalizeWhitespace(true); // each Unicode whitespace -> ASCII space (offset-preserving) +options.setNormalizeDashes(true); // Unicode dashes -> hyphen-minus (offset-preserving for BMP) +NameFinderDL finder = new NameFinderDL(model, vocab, ids2Labels, options, sentenceDetector); +``` + +The same options apply to `DocumentCategorizerDL`. The underlying +`CharClass` / `CodePointSet` engine and the broader normalization pipeline live +in `opennlp.tools.util.normalizer` and are documented in the OpenNLP manual +chapter *Text Normalization*. + Export a Hugging Face NER model to ONNX, e.g.: ```bash @@ -30,6 +55,9 @@ python -m transformers.onnx --model=dslim/bert-base-NER --feature token-classifi ## DocumentCategorizerDL +Uses the same Unicode whitespace chunking and optional `InferenceOptions` +normalization as `NameFinderDL` (see above). + Export a Huggingface classification (e.g. 
sentiment) model to ONNX, e.g.: ```bash diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/ExtendedPictographic.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/ExtendedPictographic.java new file mode 100644 index 000000000..46903bc1a --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/ExtendedPictographic.java @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +import java.io.BufferedReader; +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.io.UncheckedIOException; +import java.nio.charset.StandardCharsets; +import java.util.BitSet; + +/** + * Tests the Unicode {@code Extended_Pictographic} property of a code point. + * + *

This is the one extra property the word boundary algorithm needs (rule WB3c), to keep emoji + * zero-width-joiner sequences together. The data is loaded once from the {@code emoji-data.txt} + * derived resource of the Unicode Character Database and stored in a {@link BitSet}, so membership + * is an O(1) bit test.

+ */ +public final class ExtendedPictographic { + + private static final String RESOURCE = "ExtendedPictographic.txt"; + + private static final BitSet MEMBERS = new BitSet(); + + static { + try (InputStream in = ExtendedPictographic.class.getResourceAsStream(RESOURCE)) { + if (in == null) { + throw new IllegalStateException("Missing Extended_Pictographic data resource: " + RESOURCE); + } + load(in); + } catch (IOException e) { + throw new UncheckedIOException("Unable to read Extended_Pictographic data resource", e); + } + } + + private ExtendedPictographic() { + } + + private static void load(InputStream in) throws IOException { + try (BufferedReader reader = + new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) { + String line; + while ((line = reader.readLine()) != null) { + final int hash = line.indexOf('#'); + final String content = (hash < 0 ? line : line.substring(0, hash)).strip(); + if (content.isEmpty()) { + continue; + } + final int semicolon = content.indexOf(';'); + final String codePoints = (semicolon < 0 ? content : content.substring(0, semicolon)).strip(); + final int dots = codePoints.indexOf(".."); + if (dots < 0) { + MEMBERS.set(Integer.parseInt(codePoints, 16)); + } else { + final int start = Integer.parseInt(codePoints.substring(0, dots), 16); + final int end = Integer.parseInt(codePoints.substring(dots + 2), 16); + MEMBERS.set(start, end + 1); + } + } + } + } + + /** + * {@return whether a code point has the {@code Extended_Pictographic} property} + * + * @param codePoint The code point. Values outside {@code [0, U+10FFFF]} return {@code false}. 
+ */ + public static boolean is(int codePoint) { + return codePoint >= 0 && codePoint <= Character.MAX_CODE_POINT && MEMBERS.get(codePoint); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreak.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreak.java new file mode 100644 index 000000000..570df2b91 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreak.java @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +/** + * The Unicode {@code Word_Break} property values, used by the UAX #29 word boundary algorithm. + * + *

{@link #OTHER} is the default for code points that carry no {@code Word_Break} value in the + * Unicode Character Database. The remaining constants correspond one-to-one to the values in + * {@code WordBreakProperty.txt} (see + * UAX #29).

+ */ +public enum WordBreak { + + /** No assigned {@code Word_Break} value (the default). */ + OTHER, + /** Carriage return ({@code U+000D}). */ + CR, + /** Line feed ({@code U+000A}). */ + LF, + /** Other mandatory line breaks (vertical tab, form feed, NEL, line/paragraph separators). */ + NEWLINE, + /** Combining marks and other characters that extend the preceding one. */ + EXTEND, + /** Zero width joiner ({@code U+200D}). */ + ZWJ, + /** Regional indicator symbols (used in pairs for flag emoji). */ + REGIONAL_INDICATOR, + /** Format characters. */ + FORMAT, + /** Katakana letters. */ + KATAKANA, + /** Hebrew letters (distinguished so a single quote may join them). */ + HEBREW_LETTER, + /** Alphabetic letters. */ + ALETTER, + /** The apostrophe ({@code U+0027}). */ + SINGLE_QUOTE, + /** The quotation mark ({@code U+0022}). */ + DOUBLE_QUOTE, + /** Characters that join letters or numbers (for example the full stop). */ + MID_NUM_LET, + /** Characters that join letters (for example the middle dot). */ + MID_LETTER, + /** Characters that join numbers (for example the comma). */ + MID_NUM, + /** Decimal digits. */ + NUMERIC, + /** Characters that extend a number or letter sequence (for example the low line). */ + EXTEND_NUM_LET, + /** Whitespace that segments words ({@code Word_Break=WSegSpace}). */ + WSEG_SPACE; + + /** + * Maps a {@code Word_Break} value name, as written in {@code WordBreakProperty.txt}, to its + * constant. + * + * @param name The property value name (for example {@code ALetter}). + * @return The matching constant. + * @throws IllegalArgumentException Thrown if the name is not a known {@code Word_Break} value. 
+ */ + static WordBreak fromPropertyName(String name) { + return switch (name) { + case "CR" -> CR; + case "LF" -> LF; + case "Newline" -> NEWLINE; + case "Extend" -> EXTEND; + case "ZWJ" -> ZWJ; + case "Regional_Indicator" -> REGIONAL_INDICATOR; + case "Format" -> FORMAT; + case "Katakana" -> KATAKANA; + case "Hebrew_Letter" -> HEBREW_LETTER; + case "ALetter" -> ALETTER; + case "Single_Quote" -> SINGLE_QUOTE; + case "Double_Quote" -> DOUBLE_QUOTE; + case "MidNumLet" -> MID_NUM_LET; + case "MidLetter" -> MID_LETTER; + case "MidNum" -> MID_NUM; + case "Numeric" -> NUMERIC; + case "ExtendNumLet" -> EXTEND_NUM_LET; + case "WSegSpace" -> WSEG_SPACE; + default -> throw new IllegalArgumentException("Unknown Word_Break value: " + name); + }; + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreakProperty.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreakProperty.java new file mode 100644 index 000000000..ed4fa5189 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordBreakProperty.java @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.tokenize.uax29; + +import java.io.BufferedReader; +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.io.UncheckedIOException; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.List; + +/** + * Looks up the Unicode {@link WordBreak Word_Break} property of a code point. + * + *

The data is loaded once from the {@code WordBreakProperty.txt} resource of the Unicode + * Character Database (parsed with simple cursor scanning, no regular expression). Lookup is O(1) + * for the Basic Multilingual Plane (a direct array index) and O(log n) for supplementary code + * points (a binary search over a small sorted range table), so it imposes no per-character + * allocation on the word boundary algorithm.

+ */ +public final class WordBreakProperty { + + private static final String RESOURCE = "WordBreakProperty.txt"; + + private static final WordBreak[] VALUES = WordBreak.values(); + + // Word_Break value ordinal for each BMP code point; the default 0 is WordBreak.OTHER. + private static final byte[] BMP = new byte[0x10000]; + + // Supplementary ranges (above the BMP), sorted by start for binary search. + private static final int[] SUPPLEMENTARY_START; + private static final int[] SUPPLEMENTARY_END; + private static final byte[] SUPPLEMENTARY_VALUE; + + static { + final List<int[]> supplementary = new ArrayList<>(); + try (InputStream in = WordBreakProperty.class.getResourceAsStream(RESOURCE)) { + if (in == null) { + throw new IllegalStateException("Missing Word_Break data resource: " + RESOURCE); + } + load(in, supplementary); + } catch (IOException e) { + throw new UncheckedIOException("Unable to read Word_Break data resource " + RESOURCE, e); + } + supplementary.sort((a, b) -> Integer.compare(a[0], b[0])); + SUPPLEMENTARY_START = new int[supplementary.size()]; + SUPPLEMENTARY_END = new int[supplementary.size()]; + SUPPLEMENTARY_VALUE = new byte[supplementary.size()]; + for (int i = 0; i < supplementary.size(); i++) { + final int[] range = supplementary.get(i); + SUPPLEMENTARY_START[i] = range[0]; + SUPPLEMENTARY_END[i] = range[1]; + SUPPLEMENTARY_VALUE[i] = (byte) range[2]; + } + } + + private WordBreakProperty() { + } + + private static void load(InputStream in, List<int[]> supplementary) throws IOException { + try (BufferedReader reader = + new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) { + String line; + while ((line = reader.readLine()) != null) { + final int hash = line.indexOf('#'); + final String content = (hash < 0 ? 
line : line.substring(0, hash)).strip(); + if (content.isEmpty()) { + continue; + } + final int semicolon = content.indexOf(';'); + final String codePoints = content.substring(0, semicolon).strip(); + final String value = content.substring(semicolon + 1).strip(); + final byte ordinal = (byte) WordBreak.fromPropertyName(value).ordinal(); + + final int dots = codePoints.indexOf(".."); + final int start; + final int end; + if (dots < 0) { + start = Integer.parseInt(codePoints, 16); + end = start; + } else { + start = Integer.parseInt(codePoints.substring(0, dots), 16); + end = Integer.parseInt(codePoints.substring(dots + 2), 16); + } + assign(start, end, ordinal, supplementary); + } + } + } + + private static void assign(int start, int end, byte ordinal, List<int[]> supplementary) { + final int bmpEnd = Math.min(end, 0xFFFF); + for (int codePoint = start; codePoint <= bmpEnd; codePoint++) { + BMP[codePoint] = ordinal; + } + if (end > 0xFFFF) { + supplementary.add(new int[] {Math.max(start, 0x10000), end, ordinal}); + } + } + + /** + * {@return the {@link WordBreak} value of a code point} + * + * @param codePoint The code point. Values outside {@code [0, U+10FFFF]} return + * {@link WordBreak#OTHER}. + */ + public static WordBreak of(int codePoint) { + return VALUES[ordinalOf(codePoint)]; + } + + /** + * {@return the {@link WordBreak#ordinal() ordinal} of a code point's {@link WordBreak} value} + * This is the allocation-free form of {@link #of(int)} for hot loops that work with ordinals. + * + * @param codePoint The code point. Values outside {@code [0, U+10FFFF]} return the ordinal of + * {@link WordBreak#OTHER}. 
+ */ + public static int ordinalOf(int codePoint) { + if (codePoint >= 0 && codePoint <= 0xFFFF) { + return BMP[codePoint]; + } + return ordinalOfSupplementary(codePoint); + } + + private static int ordinalOfSupplementary(int codePoint) { + if (codePoint > 0xFFFF && codePoint <= Character.MAX_CODE_POINT) { + int low = 0; + int high = SUPPLEMENTARY_START.length - 1; + while (low <= high) { + final int mid = (low + high) >>> 1; + if (codePoint < SUPPLEMENTARY_START[mid]) { + high = mid - 1; + } else if (codePoint > SUPPLEMENTARY_END[mid]) { + low = mid + 1; + } else { + return SUPPLEMENTARY_VALUE[mid]; + } + } + } + return WordBreak.OTHER.ordinal(); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordSegmenter.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordSegmenter.java new file mode 100644 index 000000000..eddbebacc --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordSegmenter.java @@ -0,0 +1,397 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.tokenize.uax29; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import opennlp.tools.util.Span; + +/** + * Finds word boundaries in text using the Unicode Text Segmentation algorithm + * (UAX #29), rules WB1 through WB999. + * + *

<p>The implementation is a single forward cursor pass with O(1) {@link WordBreakProperty} + * lookups and no regular expression. It decodes each code point once, keeps only a constant amount + * of state, and allocates nothing per character. It implements the "ignore" semantics of WB4 (a + * base character absorbs following {@code Extend}, {@code Format}, and {@code ZWJ}), the look-ahead + * rules WB6/WB7/WB7b/WB12, the Hebrew quote rules WB7a-WB7c, the emoji zero-width-joiner rule WB3c, + * and regional-indicator pairing WB15/WB16. The look-ahead for the WB6/WB7b/WB12 rules is resolved + * lazily and only at mid-word punctuation, so the common case never scans ahead.</p>

+ * + *

<p>{@link #forEachSegment(CharSequence, SegmentConsumer)} streams the segments with no + * allocation; {@link #boundaries(CharSequence)} returns every boundary offset (always including + * {@code 0} and the text length); {@link #segments(CharSequence)} returns the spans between + * them.</p>

+ */ +public final class WordSegmenter { + + /** Receives each word segment as the half-open character range {@code [start, end)}. */ + @FunctionalInterface + public interface SegmentConsumer { + /** + * Accepts one segment. + * + * @param start The inclusive start character offset. + * @param end The exclusive end character offset. + */ + void accept(int start, int end); + } + + // Decisions for the WB5-WB999 rules. NO_BREAK/BREAK are final; CONSULT marks a (last, current) + // pair whose decision also depends on look-ahead or regional-indicator parity, so the full rule + // cascade must be consulted. GO_SLOW appears only in the FAST table (never in TRANSITION) and + // marks a current class that can trigger a WB3-family or WB4 rule. + private static final byte NO_BREAK = 0; + private static final byte BREAK = 1; + private static final byte CONSULT = 2; + private static final byte GO_SLOW = 3; + + private static final WordBreak[] CLASSES = WordBreak.values(); + private static final int CLASS_COUNT = CLASSES.length; + + // TRANSITION[last * CLASS_COUNT + current] holds the WB5-WB999 decision for a (last, current) + // pair: NO_BREAK or BREAK when the decision is the same for every secondLast, next significant + // value, and parity, or CONSULT otherwise. The table is derived from afterPrefix(...) at + // class-load, so it is equivalent to the rule cascade by construction; only the hot path reads it. + private static final byte[] TRANSITION = buildTransitionTable(); + + // Ordinals of the Word_Break classes that the WB3 family and WB4 examine. The hot loop works with + // ordinals to avoid materializing a WordBreak enum per character. 
+ private static final int OTHER_ORDINAL = WordBreak.OTHER.ordinal(); + private static final int CR_ORDINAL = WordBreak.CR.ordinal(); + private static final int LF_ORDINAL = WordBreak.LF.ordinal(); + private static final int NEWLINE_ORDINAL = WordBreak.NEWLINE.ordinal(); + private static final int ZWJ_ORDINAL = WordBreak.ZWJ.ordinal(); + private static final int WSEG_SPACE_ORDINAL = WordBreak.WSEG_SPACE.ordinal(); + private static final int EXTEND_ORDINAL = WordBreak.EXTEND.ordinal(); + private static final int FORMAT_ORDINAL = WordBreak.FORMAT.ordinal(); + private static final int REGIONAL_INDICATOR_ORDINAL = WordBreak.REGIONAL_INDICATOR.ordinal(); + + // SPECIAL[ordinal] is true for the classes that can trigger a WB3-family or WB4 rule (the + // newline, ZWJ, word-segment-space, and ignorable classes). When neither the previous nor the + // current class is special, those rules cannot fire and the hot loop goes straight to the + // transition table. + private static final boolean[] SPECIAL = buildSpecialTable(); + + // FAST[last * CLASS_COUNT + current] is the hot-loop table: the TRANSITION decision when the + // current class is ordinary, or GO_SLOW when it is special. One read decides the common case and + // detects a special current class, so the loop never reloads SPECIAL[current]. 
+ private static final byte[] FAST = buildFastTable(); + + private WordSegmenter() { + } + + private static boolean[] buildSpecialTable() { + final boolean[] special = new boolean[CLASS_COUNT]; + special[CR_ORDINAL] = true; + special[LF_ORDINAL] = true; + special[NEWLINE_ORDINAL] = true; + special[ZWJ_ORDINAL] = true; + special[WSEG_SPACE_ORDINAL] = true; + special[EXTEND_ORDINAL] = true; + special[FORMAT_ORDINAL] = true; + return special; + } + + private static byte[] buildFastTable() { + final byte[] fast = new byte[CLASS_COUNT * CLASS_COUNT]; + for (int last = 0; last < CLASS_COUNT; last++) { + for (int current = 0; current < CLASS_COUNT; current++) { + final int index = last * CLASS_COUNT + current; + fast[index] = SPECIAL[current] ? GO_SLOW : TRANSITION[index]; + } + } + return fast; + } + + private static byte[] buildTransitionTable() { + final byte[] table = new byte[CLASS_COUNT * CLASS_COUNT]; + for (final WordBreak last : CLASSES) { + for (final WordBreak current : CLASSES) { + table[last.ordinal() * CLASS_COUNT + current.ordinal()] = deriveDecision(last, current); + } + } + return table; + } + + // Returns the constant WB5-WB999 decision for a (last, current) pair, or CONSULT if afterPrefix + // gives different answers for different secondLast, next, or parity values. + private static byte deriveDecision(WordBreak last, WordBreak current) { + Boolean constant = null; + for (final WordBreak secondLast : CLASSES) { + for (final WordBreak next : CLASSES) { + for (int parity = 0; parity <= 1; parity++) { + final boolean decision = afterPrefix(current, last, secondLast, next, parity); + if (constant == null) { + constant = decision; + } else if (constant != decision) { + return CONSULT; + } + } + } + } + return constant ? BREAK : NO_BREAK; + } + + /** + * Streams the word segments of {@code text} to {@code consumer} in order, allocating nothing. 
+ * Each segment is delivered as the half-open character range {@code [start, end)}; the segments + * are contiguous and together cover the whole text. + * + * @param text The text to segment. + * @param consumer The receiver of the segment ranges. + */ + public static void forEachSegment(CharSequence text, SegmentConsumer consumer) { + final int length = text.length(); + if (length == 0) { + return; + } + + final int firstCp = Character.codePointAt(text, 0); + int prev = WordBreakProperty.ordinalOf(firstCp); + boolean prevSpecial = SPECIAL[prev]; + int last = OTHER_ORDINAL; + int secondLast = OTHER_ORDINAL; + int regionalIndicatorRun = 0; + if (!isIgnorable(prev)) { + last = prev; + regionalIndicatorRun = prev == REGIONAL_INDICATOR_ORDINAL ? 1 : 0; + } + + int segmentStart = 0; + int i = Character.charCount(firstCp); + while (i < length) { + final int codePoint = Character.codePointAt(text, i); + final int charCount = Character.charCount(codePoint); + final int current = WordBreakProperty.ordinalOf(codePoint); + + // One table read per character. It is the decision for the common case and, as GO_SLOW, the + // "current is special" flag; combined with the carried prevSpecial it avoids the two SPECIAL + // look-ups the rules would otherwise need. + final byte action = FAST[last * CLASS_COUNT + current]; + final boolean currentSpecial = action == GO_SLOW; + final boolean breakHere; + if (prevSpecial || currentSpecial) { + breakHere = breakAtSpecial(prev, current, codePoint, last, secondLast, + regionalIndicatorRun, text, i + charCount, length); + } else { + breakHere = action == CONSULT + ? consult(text, i + charCount, length, current, last, secondLast, regionalIndicatorRun) + : action == BREAK; + } + + if (breakHere) { + consumer.accept(segmentStart, i); + segmentStart = i; + } + + if (!isIgnorable(current)) { + secondLast = last; + last = current; + regionalIndicatorRun = current == REGIONAL_INDICATOR_ORDINAL ? 
regionalIndicatorRun + 1 : 0; + } + prev = current; + prevSpecial = currentSpecial; + i += charCount; + } + consumer.accept(segmentStart, length); + } + + // Handles a position where the previous or current class is special: applies the WB3 family and + // WB4 (which depend on the immediately preceding code point), then falls back to the transition + // table for the WB5-WB999 rules. + private static boolean breakAtSpecial(int prev, int current, int codePoint, int last, + int secondLast, int regionalIndicatorRun, CharSequence text, int nextFrom, int length) { + if (prev == CR_ORDINAL && current == LF_ORDINAL) { + return false; // WB3 + } + if (prev == CR_ORDINAL || prev == LF_ORDINAL || prev == NEWLINE_ORDINAL) { + return true; // WB3a + } + if (current == CR_ORDINAL || current == LF_ORDINAL || current == NEWLINE_ORDINAL) { + return true; // WB3b + } + if (prev == ZWJ_ORDINAL && ExtendedPictographic.is(codePoint)) { + return false; // WB3c + } + if (prev == WSEG_SPACE_ORDINAL && current == WSEG_SPACE_ORDINAL) { + return false; // WB3d + } + if (current == EXTEND_ORDINAL || current == FORMAT_ORDINAL || current == ZWJ_ORDINAL) { + return false; // WB4 + } + final byte action = TRANSITION[last * CLASS_COUNT + current]; + return action == CONSULT + ? consult(text, nextFrom, length, current, last, secondLast, regionalIndicatorRun) + : action == BREAK; + } + + // Resolves a CONSULT cell: a look-ahead (WB6/WB7b/WB12) or parity (WB15/WB16) rule applies, so + // the next significant value is read (the only place it is needed) and the full cascade is run. 
+ private static boolean consult(CharSequence text, int nextFrom, int length, int current, + int last, int secondLast, int regionalIndicatorRun) { + final WordBreak next = nextSignificant(text, nextFrom, length); + return afterPrefix(CLASSES[current], CLASSES[last], CLASSES[secondLast], next, + regionalIndicatorRun); + } + + /** + * Returns the word boundary character offsets in {@code text}, in ascending order, including the + * boundaries at {@code 0} and {@code text.length()}. + * + * @param text The text to segment. + * @return The boundary offsets; for empty text, {@code [0]}. + */ + public static int[] boundaries(CharSequence text) { + if (text.length() == 0) { + return new int[] {0}; + } + final IntList offsets = new IntList(); + offsets.add(0); // WB1: break at start of text. + // Boundaries are 0 followed by every segment end; the last end is the text length (WB2). + forEachSegment(text, (start, end) -> offsets.add(end)); + return offsets.toArray(); + } + + /** + * Returns the word segments of {@code text} as spans between consecutive boundaries. + * + * @param text The text to segment. + * @return The segment spans, in order. + */ + public static List<Span> segments(CharSequence text) { + final List<Span> spans = new ArrayList<>(); + forEachSegment(text, (start, end) -> spans.add(new Span(start, end))); + return spans; + } + + // The Word_Break value of the next non-ignorable code point at or after "from" (else OTHER). + private static WordBreak nextSignificant(CharSequence text, int from, int length) { + for (int j = from; j < length; ) { + final int codePoint = Character.codePointAt(text, j); + final WordBreak value = WordBreakProperty.of(codePoint); + if (!isIgnorable(value)) { + return value; + } + j += Character.charCount(codePoint); + } + return WordBreak.OTHER; + } + + // Applies WB5 through WB999. 
These rules depend only on the last two significant values, the + // current value, the next significant value, and the regional-indicator parity, never on the + // immediately preceding code point; that is what lets them be captured by the transition table. + // "last"/"secondLast" skip the WB4-absorbed characters. + private static boolean afterPrefix(WordBreak current, WordBreak last, WordBreak secondLast, + WordBreak next, int regionalIndicatorRun) { + if (isAhLetter(last) && isAhLetter(current)) { + return false; // WB5 + } + if (isAhLetter(last) && isMidLetter(current) && isAhLetter(next)) { + return false; // WB6 + } + if (isAhLetter(secondLast) && isMidLetter(last) && isAhLetter(current)) { + return false; // WB7 + } + if (last == WordBreak.HEBREW_LETTER && current == WordBreak.SINGLE_QUOTE) { + return false; // WB7a + } + if (last == WordBreak.HEBREW_LETTER && current == WordBreak.DOUBLE_QUOTE + && next == WordBreak.HEBREW_LETTER) { + return false; // WB7b + } + if (secondLast == WordBreak.HEBREW_LETTER && last == WordBreak.DOUBLE_QUOTE + && current == WordBreak.HEBREW_LETTER) { + return false; // WB7c + } + if (last == WordBreak.NUMERIC && current == WordBreak.NUMERIC) { + return false; // WB8 + } + if (isAhLetter(last) && current == WordBreak.NUMERIC) { + return false; // WB9 + } + if (last == WordBreak.NUMERIC && isAhLetter(current)) { + return false; // WB10 + } + if (secondLast == WordBreak.NUMERIC && isMidNumber(last) && current == WordBreak.NUMERIC) { + return false; // WB11 + } + if (last == WordBreak.NUMERIC && isMidNumber(current) && next == WordBreak.NUMERIC) { + return false; // WB12 + } + if (last == WordBreak.KATAKANA && current == WordBreak.KATAKANA) { + return false; // WB13 + } + if ((isAhLetter(last) || last == WordBreak.NUMERIC || last == WordBreak.KATAKANA + || last == WordBreak.EXTEND_NUM_LET) && current == WordBreak.EXTEND_NUM_LET) { + return false; // WB13a + } + if (last == WordBreak.EXTEND_NUM_LET && (isAhLetter(current) || current 
== WordBreak.NUMERIC + || current == WordBreak.KATAKANA)) { + return false; // WB13b + } + if (current == WordBreak.REGIONAL_INDICATOR && last == WordBreak.REGIONAL_INDICATOR + && (regionalIndicatorRun & 1) == 1) { + return false; // WB15 / WB16 + } + return true; // WB999 + } + + private static boolean isIgnorable(WordBreak value) { + return value == WordBreak.EXTEND || value == WordBreak.FORMAT || value == WordBreak.ZWJ; + } + + private static boolean isIgnorable(int ordinal) { + return ordinal == EXTEND_ORDINAL || ordinal == FORMAT_ORDINAL || ordinal == ZWJ_ORDINAL; + } + + private static boolean isAhLetter(WordBreak value) { + return value == WordBreak.ALETTER || value == WordBreak.HEBREW_LETTER; + } + + // MidLetter | MidNumLet | Single_Quote (the "MidLetterQ" set used by WB6 and WB7). + private static boolean isMidLetter(WordBreak value) { + return value == WordBreak.MID_LETTER || value == WordBreak.MID_NUM_LET + || value == WordBreak.SINGLE_QUOTE; + } + + // MidNum | MidNumLet | Single_Quote (the set used by WB11 and WB12). + private static boolean isMidNumber(WordBreak value) { + return value == WordBreak.MID_NUM || value == WordBreak.MID_NUM_LET + || value == WordBreak.SINGLE_QUOTE; + } + + // A minimal growable int array, so boundaries() makes one backing allocation instead of one per + // boundary (an ArrayList would box every offset). 
+ private static final class IntList { + private int[] values = new int[16]; + private int size; + + void add(int value) { + if (size == values.length) { + values = Arrays.copyOf(values, values.length * 2); + } + values[size++] = value; + } + + int[] toArray() { + return Arrays.copyOf(values, size); + } + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordToken.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordToken.java new file mode 100644 index 000000000..0e07287db --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordToken.java @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +import opennlp.tools.util.Span; + +/** + * A word token produced by {@link WordTokenizer}: the character {@link Span} into the source text + * together with its {@link WordType}. + * + * @param span The character offsets of the token in the source text. + * @param type The token category. + */ +public record WordToken(Span span, WordType type) { + + /** + * Returns the covered text of this token. + * + * @param source The text this token was produced from. 
+ * @return The covered text. + */ + public CharSequence text(CharSequence source) { + return span.getCoveredText(source); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordTokenizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordTokenizer.java new file mode 100644 index 000000000..8ab9189c1 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordTokenizer.java @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +import java.util.ArrayList; +import java.util.List; + +import opennlp.tools.tokenize.Tokenizer; +import opennlp.tools.util.Span; + +/** + * A word tokenizer built on the Unicode Text Segmentation algorithm (UAX #29). It finds segments + * with {@link WordSegmenter}, keeps the ones that are words (letters, digits, ideographs, kana, + * Hangul, Southeast-Asian script, or emoji), drops whitespace and punctuation, and classifies each + * kept token with a {@link WordType}. + * + *

<p>A token longer than {@code maxTokenLength} is emitted as consecutive pieces, never splitting a + * surrogate pair. The tokenizer reports offset {@link Span}s, so the original text and its character + * offsets are preserved for downstream normalization.</p>

+ * + *

<p>It implements {@link Tokenizer}: {@link #tokenize(String)} returns the token strings and + * {@link #tokenizePos(String)} their offsets. {@link #tokenizeTyped(CharSequence)} additionally + * carries each token's {@link WordType}, and {@link #tokenize(CharSequence, TokenHandler)} streams + * tokens with no per-token allocation. Instances are immutable and thread-safe.</p>

+ */ +public final class WordTokenizer implements Tokenizer { + + /** Receives each word token as a character range and its type, with no allocation. */ + @FunctionalInterface + public interface TokenHandler { + /** + * Accepts one word token. + * + * @param start The inclusive start character offset. + * @param end The exclusive end character offset. + * @param type The token category. + */ + void token(int start, int end, WordType type); + } + + /** The default maximum token length. */ + public static final int DEFAULT_MAX_TOKEN_LENGTH = 255; + + private final int maxTokenLength; + + /** + * Creates a tokenizer with the {@linkplain #DEFAULT_MAX_TOKEN_LENGTH default} maximum token + * length. + */ + public WordTokenizer() { + this(DEFAULT_MAX_TOKEN_LENGTH); + } + + /** + * Creates a tokenizer with the given maximum token length. + * + * @param maxTokenLength The maximum number of characters in a token; longer tokens are chopped + * into consecutive pieces. Must be at least {@code 1}. + * @throws IllegalArgumentException if {@code maxTokenLength} is less than {@code 1}. + */ + public WordTokenizer(int maxTokenLength) { + if (maxTokenLength < 1) { + throw new IllegalArgumentException("maxTokenLength must be at least 1, got " + maxTokenLength); + } + this.maxTokenLength = maxTokenLength; + } + + /** + * Streams the word tokens of {@code text} to {@code handler} in order, allocating nothing. + * + * @param text The text to tokenize. + * @param handler The receiver of the tokens. + */ + public void tokenize(CharSequence text, TokenHandler handler) { + WordSegmenter.forEachSegment(text, (start, end) -> { + final WordType type = WordType.of(text, start, end); + if (type != null) { + emit(text, start, end, type, handler); + } + }); + } + + /** + * Returns the word tokens of {@code s} as strings, in order. + * + * @param s The text to tokenize. + * @return The token strings. 
+ */ + @Override + public String[] tokenize(String s) { + final List<String> tokens = new ArrayList<>(); + tokenize(s, (start, end, type) -> tokens.add(s.substring(start, end))); + return tokens.toArray(new String[0]); + } + + /** + * Returns the offset spans of the word tokens of {@code s}, in order. + * + * @param s The text to tokenize. + * @return The token spans. + */ + @Override + public Span[] tokenizePos(String s) { + final List<Span> spans = tokenizeSpans(s); + return spans.toArray(new Span[0]); + } + + /** + * Returns the offset spans of the word tokens in {@code text}, in order. + * + * @param text The text to tokenize. + * @return The word-token spans. + */ + public List<Span> tokenizeSpans(CharSequence text) { + final List<Span> spans = new ArrayList<>(); + tokenize(text, (start, end, type) -> spans.add(new Span(start, end))); + return spans; + } + + /** + * Returns the word tokens in {@code text}, each carrying its {@link WordType}, in order. + * + * @param text The text to tokenize. + * @return The typed word tokens. + */ + public List<WordToken> tokenizeTyped(CharSequence text) { + final List<WordToken> tokens = new ArrayList<>(); + tokenize(text, (start, end, type) -> tokens.add(new WordToken(new Span(start, end), type))); + return tokens; + } + + // Emits [start, end) as one or more tokens no longer than maxTokenLength, never splitting a + // surrogate pair. The whole word is classified once and every piece carries that type. + private void emit(CharSequence text, int start, int end, WordType type, TokenHandler handler) { + int from = start; + while (end - from > maxTokenLength) { + int cut = from + maxTokenLength; + if (Character.isHighSurrogate(text.charAt(cut - 1))) { + cut--; // keep the surrogate pair together + } + if (cut <= from) { + // maxTokenLength is shorter than the leading code point; emit it whole rather than stall. 
+ cut = from + Character.charCount(Character.codePointAt(text, from)); + } + handler.token(from, cut, type); + from = cut; + } + if (from < end) { + handler.token(from, end, type); + } + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordType.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordType.java new file mode 100644 index 000000000..652772069 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/uax29/WordType.java @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +/** + * The category of a {@linkplain WordTokenizer word token}. {@link #ALPHANUMERIC} and + * {@link #NUMERIC} cover letter and digit words; the remaining categories identify scripts and + * emoji that benefit from script-specific handling. The boundaries themselves follow the Unicode + * release shipped with {@link WordSegmenter}. + */ +public enum WordType { + + /** A token that contains at least one letter (optionally mixed with digits and connectors). */ + ALPHANUMERIC, + + /** A token made up entirely of digits and numeric connectors. */ + NUMERIC, + + /** A single Han ideograph. 
*/ + IDEOGRAPHIC, + + /** A Hiragana token. */ + HIRAGANA, + + /** A Katakana token. */ + KATAKANA, + + /** A Hangul token. */ + HANGUL, + + /** A token in a Southeast Asian script that requires dictionary segmentation (Thai, Lao, ...). */ + SOUTHEAST_ASIAN, + + /** An emoji, emoji sequence, or regional-indicator flag. */ + EMOJI; + + private static final int REGIONAL_INDICATOR_FIRST = 0x1F1E6; + private static final int REGIONAL_INDICATOR_LAST = 0x1F1FF; + + // No code point below this can belong to a script-specific category (the lowest is Thai, U+0E00), + // so Latin, Greek, Cyrillic, and ASCII text skips the relatively costly script lookup entirely. + private static final int LOWEST_SCRIPT_CODE_POINT = 0x0E00; + + // ASCII kind: 0 = neither, 1 = letter, 2 = digit. No ASCII code point is pictographic or in a + // script-specific category, so ASCII characters skip those tests and the Character.isLetter / + // isDigit general-category look-ups entirely. + private static final byte[] ASCII_KIND = buildAsciiKind(); + + private static byte[] buildAsciiKind() { + final byte[] kind = new byte[0x80]; + for (int c = '0'; c <= '9'; c++) { + kind[c] = 2; + } + for (int c = 'A'; c <= 'Z'; c++) { + kind[c] = 1; + } + for (int c = 'a'; c <= 'z'; c++) { + kind[c] = 1; + } + return kind; + } + + // Classifies the code points in text over [start, end) as a word token type, or returns null + // when the range is not a word (pure whitespace, punctuation, or symbols). Emoji win over + // scripts, scripts over the generic alphanumeric/numeric split. 
+ static WordType of(CharSequence text, int start, int end) { + boolean hasLetter = false; + boolean hasDigit = false; + WordType script = null; + for (int i = start; i < end; ) { + final int codePoint = Character.codePointAt(text, i); + i += Character.charCount(codePoint); + if (codePoint < 0x80) { + final int kind = ASCII_KIND[codePoint]; + if (kind == 1) { + hasLetter = true; + } else if (kind == 2) { + hasDigit = true; + } + continue; + } + if (ExtendedPictographic.is(codePoint) || isRegionalIndicator(codePoint)) { + return EMOJI; + } + if (codePoint >= LOWEST_SCRIPT_CODE_POINT && script == null) { + script = scriptType(codePoint); + } + if (Character.isLetter(codePoint)) { + hasLetter = true; + } else if (Character.isDigit(codePoint)) { + hasDigit = true; + } + } + if (script != null) { + return script; + } + if (hasLetter) { + return ALPHANUMERIC; + } + if (hasDigit) { + return NUMERIC; + } + return null; + } + + private static boolean isRegionalIndicator(int codePoint) { + return codePoint >= REGIONAL_INDICATOR_FIRST && codePoint <= REGIONAL_INDICATOR_LAST; + } + + // Maps a code point to a script-specific token type, or null for scripts (Latin, Greek, ...) that + // fall through to the generic alphanumeric category. 
+ private static WordType scriptType(int codePoint) { + switch (Character.UnicodeScript.of(codePoint)) { + case HAN: + return IDEOGRAPHIC; + case HIRAGANA: + return HIRAGANA; + case KATAKANA: + return KATAKANA; + case HANGUL: + return HANGUL; + case THAI: + case LAO: + case MYANMAR: + case KHMER: + case TAI_LE: + case NEW_TAI_LUE: + case TAI_VIET: + return SOUTHEAST_ASIAN; + default: + return null; + } + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java index 176dd108b..e4d442123 100644 --- a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizer.java @@ -17,16 +17,18 @@ package opennlp.tools.util.normalizer; import java.util.Locale; +import java.util.Objects; /** - * A {@link CharSequenceNormalizer} that lower cases text for case-insensitive matching, using - * {@link Locale#ROOT} so the result does not depend on the JVM's default locale. + * A {@link CharSequenceNormalizer} that lower cases text for case-insensitive matching. It uses + * {@link Locale#ROOT} by default, so the result does not depend on the JVM's default locale. * *

This is the case-folding step of a search / BM25 analysis chain (the counterpart to Lucene's * lower-case filter). {@code Locale.ROOT} avoids locale surprises such as the Turkish dotless-i - * mapping; callers that need language-specific case rules should fold with an explicit locale - * upstream. Full Unicode case folding (for example German eszett, {@code U+00DF}, to {@code ss}) - * is a distinct, heavier transform and is intentionally out of scope here.

+ * mapping. A specific locale can be supplied through {@link #CaseFoldCharSequenceNormalizer(Locale)} + * or {@link #getInstance(Locale)} when a language's case rules are wanted (Turkish being the classic + * example). Full Unicode case folding (for example German eszett, {@code U+00DF}, to {@code ss}) is + * a distinct, heavier transform and is intentionally out of scope here.

*/ public class CaseFoldCharSequenceNormalizer implements CharSequenceNormalizer { @@ -35,13 +37,40 @@ public class CaseFoldCharSequenceNormalizer implements CharSequenceNormalizer { private static final CaseFoldCharSequenceNormalizer INSTANCE = new CaseFoldCharSequenceNormalizer(); - /** {@return the shared, stateless instance} */ + /** The locale whose case rules are applied. */ + private final Locale locale; + + /** Creates a normalizer that lower cases using {@link Locale#ROOT}. */ + public CaseFoldCharSequenceNormalizer() { + this(Locale.ROOT); + } + + /** + * Creates a normalizer that lower cases using the given locale. + * + * @param locale The locale whose case rules to apply. + */ + public CaseFoldCharSequenceNormalizer(Locale locale) { + this.locale = Objects.requireNonNull(locale, "locale"); + } + + /** {@return the shared, stateless {@link Locale#ROOT} instance} */ public static CaseFoldCharSequenceNormalizer getInstance() { return INSTANCE; } + /** + * {@return a normalizer for the given locale} The shared {@link Locale#ROOT} instance is returned + * for {@code Locale.ROOT}. + * + * @param locale The locale whose case rules to apply. + */ + public static CaseFoldCharSequenceNormalizer getInstance(Locale locale) { + return Locale.ROOT.equals(locale) ? 
INSTANCE : new CaseFoldCharSequenceNormalizer(locale); + } + @Override public CharSequence normalize(CharSequence text) { - return text.toString().toLowerCase(Locale.ROOT); + return text.toString().toLowerCase(locale); } } diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/ConfusableSkeletonCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/ConfusableSkeletonCharSequenceNormalizer.java new file mode 100644 index 000000000..22912a684 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/ConfusableSkeletonCharSequenceNormalizer.java @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that reduces text to its Unicode confusable + * {@linkplain Confusables#skeleton(CharSequence) skeleton} (UTS #39). + * + *

This maps lookalike characters to a common prototype so that, for example, a word spelled with + * Cyrillic or Greek letters that imitate Latin ones reduces to the same form as its Latin spelling. + * The result is a matching key, not readable text: it is lossy and does not preserve offsets, so it + * is only meaningful as a derived layer of the original/normalized model.

+ */ +public class ConfusableSkeletonCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final ConfusableSkeletonCharSequenceNormalizer INSTANCE = + new ConfusableSkeletonCharSequenceNormalizer(); + + private ConfusableSkeletonCharSequenceNormalizer() { + } + + /** {@return the shared, stateless instance} */ + public static ConfusableSkeletonCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + return Confusables.skeleton(text); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Confusables.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Confusables.java new file mode 100644 index 000000000..235842d62 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Confusables.java @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.io.BufferedReader; +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.io.UncheckedIOException; +import java.nio.charset.StandardCharsets; +import java.text.Normalizer; +import java.util.HashMap; +import java.util.Map; + +/** + * Computes the Unicode confusable skeleton of text, following the algorithm in + * UTS #39 (Unicode Security Mechanisms). Two + * strings are confusable, for example Latin {@code "paypal"} and a version using Cyrillic + * lookalikes, exactly when their skeletons are equal. + * + *

The mapping is loaded once from the {@code confusables.txt} resource of the Unicode security + * data (parsed with simple cursor scanning, no regular expression). The skeleton of a string is + * {@code NFD(map(NFD(s)))}: decompose, replace each code point with its prototype, and decompose + * again. This changes length and offsets, so it belongs to the derived, matching-only form rather + * than to any offset-preserving transform.

+ */ +public final class Confusables { + + private static final String RESOURCE = "confusables.txt"; + + // Maps a single confusable code point to its prototype sequence (one or more code points). + private static final Map<Integer, String> PROTOTYPES = load(); + + private Confusables() { + } + + private static Map<Integer, String> load() { + final Map<Integer, String> map = new HashMap<>(12000); + try (InputStream in = Confusables.class.getResourceAsStream(RESOURCE)) { + if (in == null) { + throw new IllegalStateException("Missing confusables data resource: " + RESOURCE); + } + try (BufferedReader reader = + new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) { + String line; + while ((line = reader.readLine()) != null) { + final int hash = line.indexOf('#'); + final String content = (hash < 0 ? line : line.substring(0, hash)).strip(); + if (content.isEmpty()) { + continue; + } + final int firstSemicolon = content.indexOf(';'); + final int secondSemicolon = content.indexOf(';', firstSemicolon + 1); + if (firstSemicolon < 0 || secondSemicolon < 0) { + continue; + } + final int source = Integer.parseInt(content.substring(0, firstSemicolon).strip(), 16); + final String target = content.substring(firstSemicolon + 1, secondSemicolon).strip(); + final StringBuilder prototype = new StringBuilder(); + for (final String codePoint : target.split("\\s+")) { + prototype.appendCodePoint(Integer.parseInt(codePoint, 16)); + } + map.put(source, prototype.toString()); + } + } + } catch (IOException e) { + throw new UncheckedIOException("Unable to read confusables data resource " + RESOURCE, e); + } + return map; + } + + /** + * Returns the confusable skeleton of {@code text}: {@code NFD(map(NFD(text)))} where {@code map} + * replaces each code point with its UTS #39 prototype. The skeleton is for comparison only; + * it is not human-readable text and does not preserve offsets. + * + * @param text The text to reduce. + * @return The skeleton. 
+ */ + public static String skeleton(CharSequence text) { + final String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD); + final StringBuilder mapped = new StringBuilder(decomposed.length()); + for (int i = 0; i < decomposed.length(); ) { + final int codePoint = decomposed.codePointAt(i); + i += Character.charCount(codePoint); + final String prototype = PROTOTYPES.get(codePoint); + if (prototype != null) { + mapped.append(prototype); + } else { + mapped.appendCodePoint(codePoint); + } + } + return Normalizer.normalize(mapped, Normalizer.Form.NFD); + } + + /** + * {@return whether {@code left} and {@code right} are confusable} They are confusable when their + * {@linkplain #skeleton(CharSequence) skeletons} are equal. + * + * @param left The first string. + * @param right The second string. + */ + public static boolean confusable(CharSequence left, CharSequence right) { + return skeleton(left).equals(skeleton(right)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java new file mode 100644 index 000000000..56a9fd629 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A layer of the {@link Term} normalization stack, in increasing order of aggressiveness. A + * {@link TermAnalyzer} applies a configured prefix of these to each token; the declaration order is + * the canonical pipeline order, because the transforms do not commute (case folding then accent + * folding differs from the reverse for Turkish dotted/dotless i and the German eszett). + * + *

{@link #ORIGINAL} is the source token and is always present. The character-level dimensions + * have a default transform and can therefore be requested from any term. {@link #STEM} and + * {@link #LEMMA} are token-level and require a {@link opennlp.tools.stemmer.Stemmer} or + * {@link opennlp.tools.lemmatizer.Lemmatizer} to be configured on the analyzer; {@link #LEMMA} also + * requires a part-of-speech tag.

+ */ +public enum Dimension { + + /** The original token text, the canonical source of truth. */ + ORIGINAL, + + /** Unicode canonical composition (NFC); lossless under canonical equivalence. */ + NFC, + + /** Unicode compatibility composition (NFKC); lossy (for example superscripts to digits). */ + NFKC, + + /** Unicode whitespace folded to ASCII spaces. */ + WHITESPACE, + + /** Unicode dashes folded to the ASCII hyphen-minus. */ + DASH, + + /** Case folding; lossy and locale sensitive. */ + CASE_FOLD, + + /** Diacritic and accent folding; lossy, script gated, and language-wrong for some languages. */ + ACCENT_FOLD, + + /** Confusable (homoglyph) skeleton folding per UTS #39; lossy, for matching only. */ + CONFUSABLE_FOLD, + + /** Stemming through a configured {@link opennlp.tools.stemmer.Stemmer}. */ + STEM, + + /** Lemmatization through a configured {@link opennlp.tools.lemmatizer.Lemmatizer}. */ + LEMMA +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizer.java new file mode 100644 index 000000000..170c240a4 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizer.java @@ -0,0 +1,88 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +/** + * A {@link CharSequenceNormalizer} that transliterates German umlauts and the eszett the way German + * conventionally expands them (DIN 5007-2): a-umlaut to {@code ae}, o-umlaut to {@code oe}, + * u-umlaut to {@code ue}, and eszett to {@code ss}, with the capital umlauts expanded likewise. + * + *

This is the correct diacritic fold for German, where the generic + * {@link AccentFoldCharSequenceNormalizer} (which would yield {@code a}, {@code o}, {@code u}) is + * wrong. It is an expanding, offset-changing transform, so like the other folds it belongs to the + * derived matching form rather than to anything offset-preserving. It is implemented as a single + * cursor pass with no regular expression.

+ */ +public class GermanUmlautCharSequenceNormalizer implements CharSequenceNormalizer { + + private static final long serialVersionUID = 1L; + + private static final int SMALL_A_UMLAUT = 0x00E4; + private static final int SMALL_O_UMLAUT = 0x00F6; + private static final int SMALL_U_UMLAUT = 0x00FC; + private static final int CAPITAL_A_UMLAUT = 0x00C4; + private static final int CAPITAL_O_UMLAUT = 0x00D6; + private static final int CAPITAL_U_UMLAUT = 0x00DC; + private static final int ESZETT = 0x00DF; + + private static final GermanUmlautCharSequenceNormalizer INSTANCE = + new GermanUmlautCharSequenceNormalizer(); + + private GermanUmlautCharSequenceNormalizer() { + } + + /** {@return the shared, stateless instance} */ + public static GermanUmlautCharSequenceNormalizer getInstance() { + return INSTANCE; + } + + @Override + public CharSequence normalize(CharSequence text) { + final int length = text.length(); + final StringBuilder out = new StringBuilder(length + 4); + for (int i = 0; i < length; i++) { + final char c = text.charAt(i); + switch (c) { + case SMALL_A_UMLAUT: + out.append("ae"); + break; + case SMALL_O_UMLAUT: + out.append("oe"); + break; + case SMALL_U_UMLAUT: + out.append("ue"); + break; + case CAPITAL_A_UMLAUT: + out.append("Ae"); + break; + case CAPITAL_O_UMLAUT: + out.append("Oe"); + break; + case CAPITAL_U_UMLAUT: + out.append("Ue"); + break; + case ESZETT: + out.append("ss"); + break; + default: + out.append(c); + break; + } + } + return out.toString(); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfile.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfile.java new file mode 100644 index 000000000..e45d4231d --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfile.java @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor 
license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import opennlp.tools.stemmer.Stemmer; +import opennlp.tools.stemmer.snowball.SnowballStemmer; + +/** + * Per-language normalization settings, mirroring how OpenNLP already selects a Snowball stemmer by + * language. A profile pairs a language with its Snowball {@link SnowballStemmer.ALGORITHM} and the + * diacritic fold appropriate for that language (if any). + * + *

The {@code accentFold} normalizer is the language's diacritic transform for a search form, or + * {@code null} when folding is not appropriate. It is the generic + * {@link AccentFoldCharSequenceNormalizer} for English and the major Romance languages (where + * accented letters are search variants of their base letter), the German-specific + * {@link GermanUmlautCharSequenceNormalizer} (a-umlaut to {@code ae}, eszett to {@code ss}, ...) for + * German, and {@code null} where diacritics mark distinct letters (the Nordic languages and the + * non-Latin scripts), because folding there is language-wrong. This is a search-recall choice, not a + * statement of linguistic correctness; callers can build a {@link TermAnalyzer} directly to + * override it.

+ * + * @param language The language, as an ISO 639-3 code (for example {@code "eng"}). + * @param stemmerAlgorithm The Snowball algorithm for the language. + * @param accentFold The diacritic fold for the language, or {@code null} for none. + */ +public record NormalizationProfile(String language, SnowballStemmer.ALGORITHM stemmerAlgorithm, + CharSequenceNormalizer accentFold) { + + /** + * {@return a new {@link Stemmer} for this language} A fresh instance is returned on each call + * because the Snowball stemmers are stateful and not thread-safe. + */ + public Stemmer newStemmer() { + return new SnowballStemmer(stemmerAlgorithm); + } + + /** + * Returns a search-oriented analyzer for this language: NFC, case folding, the language's + * {@linkplain #accentFold() diacritic fold} when it has one, then stemming. Each call builds an + * independent analyzer with its own stemmer, so use one per thread when stemming. + * + * @return the analyzer. + */ + public TermAnalyzer searchAnalyzer() { + final TermAnalyzer.Builder builder = TermAnalyzer.builder().nfc().caseFold(); + if (accentFold != null) { + builder.transform(Dimension.ACCENT_FOLD, accentFold); + } + return builder.stem(newStemmer()).build(); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfiles.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfiles.java new file mode 100644 index 000000000..4cd93f287 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/NormalizationProfiles.java @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.HashMap; +import java.util.Locale; +import java.util.Map; +import java.util.MissingResourceException; +import java.util.Optional; +import java.util.Set; + +import opennlp.tools.langdetect.LanguageDetector; +import opennlp.tools.stemmer.snowball.SnowballStemmer; + +/** + * A registry of {@link NormalizationProfile}s by language, with detection-based fallback. This is + * the language dispatch the design note calls for: pick the profile for a requested language, or + * detect the language with a {@link LanguageDetector} when it is unspecified. The covered languages + * are exactly those with a Snowball stemmer. + * + *

Profiles are keyed by ISO 639-3 code (what {@link LanguageDetector} produces); + * {@link #forLanguage(String)} also accepts ISO 639-1 two-letter codes.

+ */ +public final class NormalizationProfiles { + + private static final Map BY_LANGUAGE = build(); + + private NormalizationProfiles() { + } + + private static Map build() { + final Map map = new HashMap<>(); + // The generic accent fold is used for English and the major Romance languages, German uses its + // own ae/oe/ue/ss fold, and folding is disabled elsewhere (Nordic, non-Latin) where diacritics + // mark distinct letters. + final CharSequenceNormalizer latin = AccentFoldCharSequenceNormalizer.getInstance(); + final CharSequenceNormalizer german = GermanUmlautCharSequenceNormalizer.getInstance(); + add(map, "ara", SnowballStemmer.ALGORITHM.ARABIC, null); + add(map, "cat", SnowballStemmer.ALGORITHM.CATALAN, latin); + add(map, "dan", SnowballStemmer.ALGORITHM.DANISH, null); + add(map, "deu", SnowballStemmer.ALGORITHM.GERMAN, german); + add(map, "ell", SnowballStemmer.ALGORITHM.GREEK, null); + add(map, "eng", SnowballStemmer.ALGORITHM.ENGLISH, latin); + add(map, "fin", SnowballStemmer.ALGORITHM.FINNISH, null); + add(map, "fra", SnowballStemmer.ALGORITHM.FRENCH, latin); + add(map, "gle", SnowballStemmer.ALGORITHM.IRISH, null); + add(map, "hun", SnowballStemmer.ALGORITHM.HUNGARIAN, null); + add(map, "ind", SnowballStemmer.ALGORITHM.INDONESIAN, null); + add(map, "ita", SnowballStemmer.ALGORITHM.ITALIAN, latin); + add(map, "nld", SnowballStemmer.ALGORITHM.DUTCH, null); + add(map, "nor", SnowballStemmer.ALGORITHM.NORWEGIAN, null); + add(map, "por", SnowballStemmer.ALGORITHM.PORTUGUESE, latin); + add(map, "ron", SnowballStemmer.ALGORITHM.ROMANIAN, null); + add(map, "rus", SnowballStemmer.ALGORITHM.RUSSIAN, null); + add(map, "spa", SnowballStemmer.ALGORITHM.SPANISH, latin); + add(map, "swe", SnowballStemmer.ALGORITHM.SWEDISH, null); + return Map.copyOf(map); + } + + private static void add(Map map, String language, + SnowballStemmer.ALGORITHM algorithm, CharSequenceNormalizer accentFold) { + map.put(language, new NormalizationProfile(language, algorithm, 
accentFold)); + } + + /** + * Returns the profile for a language. + * + * @param language An ISO 639-3 or ISO 639-1 language code; case-insensitive. + * @return The profile, or empty if the language has no Snowball stemmer. + */ + public static Optional forLanguage(String language) { + String code = language.strip().toLowerCase(Locale.ROOT); + if (code.length() == 2) { + try { + final String iso3 = Locale.of(code).getISO3Language(); + if (!iso3.isEmpty()) { + code = iso3; + } + } catch (MissingResourceException ignored) { + // No ISO 639-3 code for this two-letter code; fall through and look up as given. + } + } + return Optional.ofNullable(BY_LANGUAGE.get(code)); + } + + /** + * Detects the language of {@code text} and returns its profile. + * + * @param text The text to detect. + * @param detector The language detector to use. + * @return The profile for the detected language, or empty if it has no Snowball stemmer. + */ + public static Optional detect(CharSequence text, + LanguageDetector detector) { + return forLanguage(detector.predictLanguage(text).getLang()); + } + + /** + * {@return the ISO 639-3 codes of the supported languages} + */ + public static Set supportedLanguages() { + return BY_LANGUAGE.keySet(); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Term.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Term.java new file mode 100644 index 000000000..08d8793f1 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Term.java @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.EnumMap; +import java.util.List; + +import opennlp.tools.util.Span; + +/** + * One token as a stack of normalization layers. The {@link #original()} form is the canonical + * source of truth; the other layers are derived, increasingly aggressive {@link Dimension}s tuned + * for matching and search. The dimensions configured on the producing {@link TermAnalyzer} are + * computed eagerly and cached; any other dimension is computed on first request, applied on top of + * the {@link #normalized() configured form}, and then cached. + * + *

Because the original is always retained, aggressive folding is safe: a match on a derived layer + * can always be reported in original coordinates through {@link #span()}. Querying a configured + * layer, or {@link #peel() peeling} the last-applied one, is O(1); adding an unconfigured dimension + * costs one transform on first touch and is O(1) thereafter.

+ * + *

Instances are created by {@link TermAnalyzer} and are not thread-safe (the lazy cache is + * mutated on first access of an unconfigured dimension).

+ */ +public final class Term { + + private final TermAnalyzer analyzer; + private final Span span; + private final String posTag; + private final EnumMap<Dimension, String> layers = new EnumMap<>(Dimension.class); + + Term(TermAnalyzer analyzer, String original, Span span, String posTag) { + this.analyzer = analyzer; + this.span = span; + this.posTag = posTag; + String value = original; + layers.put(Dimension.ORIGINAL, value); + for (final Dimension dimension : analyzer.dimensions()) { + value = analyzer.apply(dimension, value, posTag); + layers.put(dimension, value); + } + } + + /** + * {@return the source span of this token, or {@code null} if it was supplied as a pre-tokenized + * string} The span indexes into the text passed to {@link TermAnalyzer#analyze(CharSequence)}. + */ + public Span span() { + return span; + } + + /** + * {@return the original token text} + */ + public String original() { + return layers.get(Dimension.ORIGINAL); + } + + /** + * {@return the token at the analyzer's final configured dimension} Equal to {@link #original()} + * when no dimensions were configured. + */ + public String normalized() { + return at(analyzer.finalDimension()); + } + + /** + * Returns the token at {@code dimension}. Configured dimensions are cached; an unconfigured + * dimension is computed by applying its transform to {@link #normalized()} and then cached. + * + * @param dimension The dimension to project to. + * @return The token at that dimension. + * @throws IllegalStateException if the dimension needs an engine or tag that was not configured + * (see {@link Dimension#STEM} and {@link Dimension#LEMMA}). 
+ */ + public String at(Dimension dimension) { + final String cached = layers.get(dimension); + if (cached != null) { + return cached; + } + final String value = analyzer.apply(dimension, normalized(), posTag); + layers.put(dimension, value); + return value; + } + + /** + * {@return the token at the dimension just below the final configured one} This is the + * last-applied layer removed (for example the form before stemming when {@link Dimension#STEM} + * is the final dimension); equal to {@link #original()} when at most one dimension is configured. + */ + public String peel() { + final List<Dimension> dimensions = analyzer.dimensions(); + if (dimensions.size() < 2) { + return original(); + } + return at(dimensions.get(dimensions.size() - 2)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java new file mode 100644 index 000000000..b08da0a9b --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java @@ -0,0 +1,372 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.EnumMap; +import java.util.EnumSet; +import java.util.List; +import java.util.Locale; +import java.util.Objects; +import java.util.Set; + +import opennlp.tools.lemmatizer.Lemmatizer; +import opennlp.tools.stemmer.Stemmer; +import opennlp.tools.tokenize.uax29.WordTokenizer; +import opennlp.tools.util.Span; + +/** + * Builds {@link Term}s by segmenting text and applying a configured stack of normalization + * {@link Dimension}s to each token. The analyzer is the configuration; each {@link Term} is the + * layered result for one token, with the configured dimensions computed eagerly and any other + * dimension computed lazily on first request. + * + *

Segmentation uses the Unicode {@linkplain WordTokenizer UAX #29 word tokenizer}, so the + * input does not need to be pre-tokenized. The character-level dimensions ({@link Dimension#NFC} + * through {@link Dimension#CONFUSABLE_FOLD}) have built-in defaults; {@link Dimension#STEM} and + * {@link Dimension#LEMMA} are enabled by supplying a {@link Stemmer} or {@link Lemmatizer}.

+ * + *

<p>An instance is immutable and is thread-safe when its configured transforms are. The built-in + * character normalizers are stateless, but the Snowball stemmers are not, so an analyzer configured + * with a {@link Stemmer} (for example through {@link NormalizationProfile#searchAnalyzer()}) should + * not be shared across threads when {@link Dimension#STEM} is used. Build one with + * {@link #builder()}.</p>

+ */ +public final class TermAnalyzer { + + private final List<Dimension> chain; + private final Dimension finalDimension; + private final EnumMap<Dimension, CharSequenceNormalizer> transforms; + private final Stemmer stemmer; + private final Lemmatizer lemmatizer; + private final WordTokenizer tokenizer; + + private TermAnalyzer(Builder builder) { + final List<Dimension> ordered = new ArrayList<>(builder.chain); + Collections.sort(ordered); // canonical pipeline order (enum declaration order) + this.chain = List.copyOf(ordered); + this.finalDimension = ordered.isEmpty() ? Dimension.ORIGINAL : ordered.get(ordered.size() - 1); + this.transforms = defaultTransforms(); + this.transforms.putAll(builder.transforms); + this.stemmer = builder.stemmer; + this.lemmatizer = builder.lemmatizer; + this.tokenizer = builder.tokenizer; + } + + private static EnumMap<Dimension, CharSequenceNormalizer> defaultTransforms() { + final EnumMap<Dimension, CharSequenceNormalizer> map = new EnumMap<>(Dimension.class); + map.put(Dimension.NFC, NfcCharSequenceNormalizer.getInstance()); + map.put(Dimension.NFKC, NfkcCharSequenceNormalizer.getInstance()); + map.put(Dimension.WHITESPACE, WhitespaceCharSequenceNormalizer.getInstance()); + map.put(Dimension.DASH, DashCharSequenceNormalizer.getInstance()); + map.put(Dimension.CASE_FOLD, CaseFoldCharSequenceNormalizer.getInstance()); + map.put(Dimension.ACCENT_FOLD, AccentFoldCharSequenceNormalizer.getInstance()); + map.put(Dimension.CONFUSABLE_FOLD, ConfusableSkeletonCharSequenceNormalizer.getInstance()); + return map; + } + + /** + * {@return a new builder} + */ + public static Builder builder() { + return new Builder(); + } + + /** + * Segments {@code text} with the UAX #29 word tokenizer and returns one {@link Term} per + * word token, in order. The terms carry no part-of-speech tag, so {@link Dimension#LEMMA} is not + * available from them. + * + * @param text The text to analyze. + * @return The terms. 
+ */ + public List<Term> analyze(CharSequence text) { + final List<Span> spans = tokenizer.tokenizeSpans(text); + final List<Term> terms = new ArrayList<>(spans.size()); + for (final Span span : spans) { + terms.add(new Term(this, span.getCoveredText(text).toString(), span, null)); + } + return terms; + } + + /** + * Returns one {@link Term} per supplied token, attaching the matching part-of-speech tag so that + * {@link Dimension#LEMMA} can be computed. The terms have no source span. + * + * @param tokens The tokens. + * @param tags The part-of-speech tag for each token; must be the same length as {@code tokens}. + * @return The terms. + * @throws IllegalArgumentException if {@code tokens} and {@code tags} differ in length. + */ + public List<Term> analyze(String[] tokens, String[] tags) { + if (tokens.length != tags.length) { + throw new IllegalArgumentException( + "tokens and tags must be the same length, got " + tokens.length + " and " + tags.length); + } + final List<Term> terms = new ArrayList<>(tokens.length); + for (int i = 0; i < tokens.length; i++) { + terms.add(new Term(this, tokens[i], null, tags[i])); + } + return terms; + } + + /** + * {@return the configured dimensions that are computed eagerly, in canonical order} The list + * never includes {@link Dimension#ORIGINAL}, which is always present. + */ + public List<Dimension> dimensions() { + return chain; + } + + Dimension finalDimension() { + return finalDimension; + } + + // Applies one dimension's transform to a single token value. Fails loudly when a token-level + // dimension was requested without the engine (or tag) it needs. 
+ String apply(Dimension dimension, String input, String posTag) { + switch (dimension) { + case ORIGINAL: + return input; + case STEM: + if (stemmer == null) { + throw new IllegalStateException( + "Dimension STEM requires a Stemmer; configure it with builder().stem(...)"); + } + return stemmer.stem(input).toString(); + case LEMMA: + if (lemmatizer == null) { + throw new IllegalStateException( + "Dimension LEMMA requires a Lemmatizer; configure it with builder().lemmatize(...)"); + } + if (posTag == null) { + throw new IllegalStateException( + "Dimension LEMMA requires a part-of-speech tag; use analyze(tokens, tags)"); + } + return lemmatizer.lemmatize(new String[] {input}, new String[] {posTag})[0]; + default: + return transforms.get(dimension).normalize(input).toString(); + } + } + + /** A builder for {@link TermAnalyzer}. */ + public static final class Builder { + + private final EnumSet<Dimension> chain = EnumSet.noneOf(Dimension.class); + private final EnumMap<Dimension, CharSequenceNormalizer> transforms = + new EnumMap<>(Dimension.class); + private Stemmer stemmer; + private Lemmatizer lemmatizer; + private WordTokenizer tokenizer = new WordTokenizer(); + + private Builder() { + } + + /** + * Enables {@link Dimension#NFC}. + * + * @return this builder + */ + public Builder nfc() { + chain.add(Dimension.NFC); + return this; + } + + /** + * Enables {@link Dimension#NFKC}. + * + * @return this builder + */ + public Builder nfkc() { + chain.add(Dimension.NFKC); + return this; + } + + /** + * Enables {@link Dimension#WHITESPACE}. + * + * @return this builder + */ + public Builder whitespace() { + chain.add(Dimension.WHITESPACE); + return this; + } + + /** + * Enables {@link Dimension#WHITESPACE} with a specific normalizer, choosing the fold target and + * behavior. For a custom class and target use a {@link CharClass} method reference, for example + * {@code whitespace(CharClass.of(members, replacement)::collapse)}. + * + * @param normalizer The whitespace normalizer to use. 
+ * @return this builder + */ + public Builder whitespace(CharSequenceNormalizer normalizer) { + return transform(Dimension.WHITESPACE, normalizer); + } + + /** + * Enables {@link Dimension#DASH}. + * + * @return this builder + */ + public Builder dashes() { + chain.add(Dimension.DASH); + return this; + } + + /** + * Enables {@link Dimension#DASH} with a specific normalizer (a custom dash set or target). + * + * @param normalizer The dash normalizer to use. + * @return this builder + */ + public Builder dashes(CharSequenceNormalizer normalizer) { + return transform(Dimension.DASH, normalizer); + } + + /** + * Enables {@link Dimension#CASE_FOLD}. + * + * @return this builder + */ + public Builder caseFold() { + chain.add(Dimension.CASE_FOLD); + return this; + } + + /** + * Enables {@link Dimension#CASE_FOLD} using the given locale's case rules (for example Turkish + * dotted/dotless i), instead of the default {@link Locale#ROOT}. + * + * @param locale The locale whose case rules to apply. + * @return this builder + */ + public Builder caseFold(Locale locale) { + return transform(Dimension.CASE_FOLD, CaseFoldCharSequenceNormalizer.getInstance(locale)); + } + + /** + * Enables {@link Dimension#ACCENT_FOLD}. + * + * @return this builder + */ + public Builder accentFold() { + chain.add(Dimension.ACCENT_FOLD); + return this; + } + + /** + * Enables {@link Dimension#ACCENT_FOLD} restricted to a specific set of scripts, instead of the + * default Latin/Greek/Cyrillic. + * + * @param foldScripts The scripts whose diacritics to fold. + * @param foldStrokeLetters Whether to also fold stroke letters such as o-slash and l-stroke. + * @return this builder + */ + public Builder accentFold(Set foldScripts, boolean foldStrokeLetters) { + return transform(Dimension.ACCENT_FOLD, + new AccentFoldCharSequenceNormalizer(foldScripts, foldStrokeLetters)); + } + + /** + * Enables {@link Dimension#CONFUSABLE_FOLD}. 
+ * + * @return this builder + */ + public Builder confusableFold() { + chain.add(Dimension.CONFUSABLE_FOLD); + return this; + } + + /** + * Enables a character-level dimension with a specific normalizer, overriding its default (for + * example a locale-specific case fold for a language profile). + * + * @param dimension The character-level dimension to enable. + * @param normalizer The normalizer to use for it. + * @return this builder + * @throws IllegalArgumentException if {@code dimension} is {@link Dimension#ORIGINAL}, + * {@link Dimension#STEM}, or {@link Dimension#LEMMA}. + */ + public Builder transform(Dimension dimension, CharSequenceNormalizer normalizer) { + if (dimension == Dimension.ORIGINAL || dimension == Dimension.STEM + || dimension == Dimension.LEMMA) { + throw new IllegalArgumentException( + "transform(...) only applies to character-level dimensions, not " + dimension); + } + transforms.put(dimension, Objects.requireNonNull(normalizer, "normalizer")); + chain.add(dimension); + return this; + } + + /** + * Enables {@link Dimension#STEM} through the given stemmer. + * + * @param value The stemmer. + * @return this builder + */ + public Builder stem(Stemmer value) { + this.stemmer = Objects.requireNonNull(value, "stemmer"); + chain.add(Dimension.STEM); + return this; + } + + /** + * Enables {@link Dimension#LEMMA} through the given lemmatizer. + * + * @param value The lemmatizer. + * @return this builder + */ + public Builder lemmatize(Lemmatizer value) { + this.lemmatizer = Objects.requireNonNull(value, "lemmatizer"); + chain.add(Dimension.LEMMA); + return this; + } + + /** + * Sets the tokenizer used by {@link TermAnalyzer#analyze(CharSequence)}. + * + * @param value The tokenizer. 
+ * @return this builder + */ + public Builder tokenizer(WordTokenizer value) { + this.tokenizer = Objects.requireNonNull(value, "tokenizer"); + return this; + } + + /** + * Sets the maximum token length of the tokenizer used by + * {@link TermAnalyzer#analyze(CharSequence)}. Convenience for + * {@code tokenizer(new WordTokenizer(maxTokenLength))}. + * + * @param maxTokenLength The maximum number of characters in a token. + * @return this builder + */ + public Builder maxTokenLength(int maxTokenLength) { + this.tokenizer = new WordTokenizer(maxTokenLength); + return this; + } + + /** + * {@return a new {@link TermAnalyzer} with this configuration} + */ + public TermAnalyzer build() { + return new TermAnalyzer(this); + } + } +} diff --git a/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/ExtendedPictographic.txt b/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/ExtendedPictographic.txt new file mode 100644 index 000000000..fcb66f495 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/ExtendedPictographic.txt @@ -0,0 +1,461 @@ +# emoji-data.txt +# Date: 2025-07-25, 17:54:31 GMT +# © 2025 Unicode®, Inc. +# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. 
+# For terms of use and license, see https://www.unicode.org/terms_of_use.html +# +# Emoji Data for UTS #51 +# Version: 17.0 +# +# For documentation and usage, see https://www.unicode.org/reports/tr51 +00A9 ; Extended_Pictographic# E0.6 [1] (©️) copyright +00AE ; Extended_Pictographic# E0.6 [1] (®️) registered +203C ; Extended_Pictographic# E0.6 [1] (‼️) double exclamation mark +2049 ; Extended_Pictographic# E0.6 [1] (⁉️) exclamation question mark +2122 ; Extended_Pictographic# E0.6 [1] (™️) trade mark +2139 ; Extended_Pictographic# E0.6 [1] (ℹ️) information +2194..2199 ; Extended_Pictographic# E0.6 [6] (↔️..↙️) left-right arrow..down-left arrow +21A9..21AA ; Extended_Pictographic# E0.6 [2] (↩️..↪️) right arrow curving left..left arrow curving right +231A..231B ; Extended_Pictographic# E0.6 [2] (⌚..⌛) watch..hourglass done +2328 ; Extended_Pictographic# E1.0 [1] (⌨️) keyboard +23CF ; Extended_Pictographic# E1.0 [1] (⏏️) eject button +23E9..23EC ; Extended_Pictographic# E0.6 [4] (⏩..⏬) fast-forward button..fast down button +23ED..23EE ; Extended_Pictographic# E0.7 [2] (⏭️..⏮️) next track button..last track button +23EF ; Extended_Pictographic# E1.0 [1] (⏯️) play or pause button +23F0 ; Extended_Pictographic# E0.6 [1] (⏰) alarm clock +23F1..23F2 ; Extended_Pictographic# E1.0 [2] (⏱️..⏲️) stopwatch..timer clock +23F3 ; Extended_Pictographic# E0.6 [1] (⏳) hourglass not done +23F8..23FA ; Extended_Pictographic# E0.7 [3] (⏸️..⏺️) pause button..record button +24C2 ; Extended_Pictographic# E0.6 [1] (Ⓜ️) circled M +25AA..25AB ; Extended_Pictographic# E0.6 [2] (▪️..▫️) black small square..white small square +25B6 ; Extended_Pictographic# E0.6 [1] (▶️) play button +25C0 ; Extended_Pictographic# E0.6 [1] (◀️) reverse button +25FB..25FE ; Extended_Pictographic# E0.6 [4] (◻️..◾) white medium square..black medium-small square +2600..2601 ; Extended_Pictographic# E0.6 [2] (☀️..☁️) sun..cloud +2602..2603 ; Extended_Pictographic# E0.7 [2] (☂️..☃️) umbrella..snowman +2604 ; 
Extended_Pictographic# E1.0 [1] (☄️) comet +260E ; Extended_Pictographic# E0.6 [1] (☎️) telephone +2611 ; Extended_Pictographic# E0.6 [1] (☑️) check box with check +2614..2615 ; Extended_Pictographic# E0.6 [2] (☔..☕) umbrella with rain drops..hot beverage +2618 ; Extended_Pictographic# E1.0 [1] (☘️) shamrock +261D ; Extended_Pictographic# E0.6 [1] (☝️) index pointing up +2620 ; Extended_Pictographic# E1.0 [1] (☠️) skull and crossbones +2622..2623 ; Extended_Pictographic# E1.0 [2] (☢️..☣️) radioactive..biohazard +2626 ; Extended_Pictographic# E1.0 [1] (☦️) orthodox cross +262A ; Extended_Pictographic# E0.7 [1] (☪️) star and crescent +262E ; Extended_Pictographic# E1.0 [1] (☮️) peace symbol +262F ; Extended_Pictographic# E0.7 [1] (☯️) yin yang +2638..2639 ; Extended_Pictographic# E0.7 [2] (☸️..☹️) wheel of dharma..frowning face +263A ; Extended_Pictographic# E0.6 [1] (☺️) smiling face +2640 ; Extended_Pictographic# E4.0 [1] (♀️) female sign +2642 ; Extended_Pictographic# E4.0 [1] (♂️) male sign +2648..2653 ; Extended_Pictographic# E0.6 [12] (♈..♓) Aries..Pisces +265F ; Extended_Pictographic# E11.0 [1] (♟️) chess pawn +2660 ; Extended_Pictographic# E0.6 [1] (♠️) spade suit +2663 ; Extended_Pictographic# E0.6 [1] (♣️) club suit +2665..2666 ; Extended_Pictographic# E0.6 [2] (♥️..♦️) heart suit..diamond suit +2668 ; Extended_Pictographic# E0.6 [1] (♨️) hot springs +267B ; Extended_Pictographic# E0.6 [1] (♻️) recycling symbol +267E ; Extended_Pictographic# E11.0 [1] (♾️) infinity +267F ; Extended_Pictographic# E0.6 [1] (♿) wheelchair symbol +2692 ; Extended_Pictographic# E1.0 [1] (⚒️) hammer and pick +2693 ; Extended_Pictographic# E0.6 [1] (⚓) anchor +2694 ; Extended_Pictographic# E1.0 [1] (⚔️) crossed swords +2695 ; Extended_Pictographic# E4.0 [1] (⚕️) medical symbol +2696..2697 ; Extended_Pictographic# E1.0 [2] (⚖️..⚗️) balance scale..alembic +2699 ; Extended_Pictographic# E1.0 [1] (⚙️) gear +269B..269C ; Extended_Pictographic# E1.0 [2] (⚛️..⚜️) atom 
symbol..fleur-de-lis +26A0..26A1 ; Extended_Pictographic# E0.6 [2] (⚠️..⚡) warning..high voltage +26A7 ; Extended_Pictographic# E13.0 [1] (⚧️) transgender symbol +26AA..26AB ; Extended_Pictographic# E0.6 [2] (⚪..⚫) white circle..black circle +26B0..26B1 ; Extended_Pictographic# E1.0 [2] (⚰️..⚱️) coffin..funeral urn +26BD..26BE ; Extended_Pictographic# E0.6 [2] (⚽..⚾) soccer ball..baseball +26C4..26C5 ; Extended_Pictographic# E0.6 [2] (⛄..⛅) snowman without snow..sun behind cloud +26C8 ; Extended_Pictographic# E0.7 [1] (⛈️) cloud with lightning and rain +26CE ; Extended_Pictographic# E0.6 [1] (⛎) Ophiuchus +26CF ; Extended_Pictographic# E0.7 [1] (⛏️) pick +26D1 ; Extended_Pictographic# E0.7 [1] (⛑️) rescue worker’s helmet +26D3 ; Extended_Pictographic# E0.7 [1] (⛓️) chains +26D4 ; Extended_Pictographic# E0.6 [1] (⛔) no entry +26E9 ; Extended_Pictographic# E0.7 [1] (⛩️) shinto shrine +26EA ; Extended_Pictographic# E0.6 [1] (⛪) church +26F0..26F1 ; Extended_Pictographic# E0.7 [2] (⛰️..⛱️) mountain..umbrella on ground +26F2..26F3 ; Extended_Pictographic# E0.6 [2] (⛲..⛳) fountain..flag in hole +26F4 ; Extended_Pictographic# E0.7 [1] (⛴️) ferry +26F5 ; Extended_Pictographic# E0.6 [1] (⛵) sailboat +26F7..26F9 ; Extended_Pictographic# E0.7 [3] (⛷️..⛹️) skier..person bouncing ball +26FA ; Extended_Pictographic# E0.6 [1] (⛺) tent +26FD ; Extended_Pictographic# E0.6 [1] (⛽) fuel pump +2702 ; Extended_Pictographic# E0.6 [1] (✂️) scissors +2705 ; Extended_Pictographic# E0.6 [1] (✅) check mark button +2708..270C ; Extended_Pictographic# E0.6 [5] (✈️..✌️) airplane..victory hand +270D ; Extended_Pictographic# E0.7 [1] (✍️) writing hand +270F ; Extended_Pictographic# E0.6 [1] (✏️) pencil +2712 ; Extended_Pictographic# E0.6 [1] (✒️) black nib +2714 ; Extended_Pictographic# E0.6 [1] (✔️) check mark +2716 ; Extended_Pictographic# E0.6 [1] (✖️) multiply +271D ; Extended_Pictographic# E0.7 [1] (✝️) latin cross +2721 ; Extended_Pictographic# E0.7 [1] (✡️) star of David +2728 ; 
Extended_Pictographic# E0.6 [1] (✨) sparkles +2733..2734 ; Extended_Pictographic# E0.6 [2] (✳️..✴️) eight-spoked asterisk..eight-pointed star +2744 ; Extended_Pictographic# E0.6 [1] (❄️) snowflake +2747 ; Extended_Pictographic# E0.6 [1] (❇️) sparkle +274C ; Extended_Pictographic# E0.6 [1] (❌) cross mark +274E ; Extended_Pictographic# E0.6 [1] (❎) cross mark button +2753..2755 ; Extended_Pictographic# E0.6 [3] (❓..❕) red question mark..white exclamation mark +2757 ; Extended_Pictographic# E0.6 [1] (❗) red exclamation mark +2763 ; Extended_Pictographic# E1.0 [1] (❣️) heart exclamation +2764 ; Extended_Pictographic# E0.6 [1] (❤️) red heart +2795..2797 ; Extended_Pictographic# E0.6 [3] (➕..➗) plus..divide +27A1 ; Extended_Pictographic# E0.6 [1] (➡️) right arrow +27B0 ; Extended_Pictographic# E0.6 [1] (➰) curly loop +27BF ; Extended_Pictographic# E1.0 [1] (➿) double curly loop +2934..2935 ; Extended_Pictographic# E0.6 [2] (⤴️..⤵️) right arrow curving up..right arrow curving down +2B05..2B07 ; Extended_Pictographic# E0.6 [3] (⬅️..⬇️) left arrow..down arrow +2B1B..2B1C ; Extended_Pictographic# E0.6 [2] (⬛..⬜) black large square..white large square +2B50 ; Extended_Pictographic# E0.6 [1] (⭐) star +2B55 ; Extended_Pictographic# E0.6 [1] (⭕) hollow red circle +3030 ; Extended_Pictographic# E0.6 [1] (〰️) wavy dash +303D ; Extended_Pictographic# E0.6 [1] (〽️) part alternation mark +3297 ; Extended_Pictographic# E0.6 [1] (㊗️) Japanese “congratulations” button +3299 ; Extended_Pictographic# E0.6 [1] (㊙️) Japanese “secret” button +1F004 ; Extended_Pictographic# E0.6 [1] (🀄) mahjong red dragon +1F02C..1F02F ; Extended_Pictographic# E0.0 [4] (🀬..🀯) .. +1F094..1F09F ; Extended_Pictographic# E0.0 [12] (🂔..🂟) .. +1F0AF..1F0B0 ; Extended_Pictographic# E0.0 [2] (🂯..🂰) .. +1F0C0 ; Extended_Pictographic# E0.0 [1] (🃀) +1F0CF ; Extended_Pictographic# E0.6 [1] (🃏) joker +1F0D0 ; Extended_Pictographic# E0.0 [1] (🃐) +1F0F6..1F0FF ; Extended_Pictographic# E0.0 [10] (🃶..🃿) .. 
+1F170..1F171 ; Extended_Pictographic# E0.6 [2] (🅰️..🅱️) A button (blood type)..B button (blood type) +1F17E..1F17F ; Extended_Pictographic# E0.6 [2] (🅾️..🅿️) O button (blood type)..P button +1F18E ; Extended_Pictographic# E0.6 [1] (🆎) AB button (blood type) +1F191..1F19A ; Extended_Pictographic# E0.6 [10] (🆑..🆚) CL button..VS button +1F1AE..1F1E5 ; Extended_Pictographic# E0.0 [56] (🆮..🇥) .. +1F201..1F202 ; Extended_Pictographic# E0.6 [2] (🈁..🈂️) Japanese “here” button..Japanese “service charge” button +1F203..1F20F ; Extended_Pictographic# E0.0 [13] (🈃..🈏) .. +1F21A ; Extended_Pictographic# E0.6 [1] (🈚) Japanese “free of charge” button +1F22F ; Extended_Pictographic# E0.6 [1] (🈯) Japanese “reserved” button +1F232..1F23A ; Extended_Pictographic# E0.6 [9] (🈲..🈺) Japanese “prohibited” button..Japanese “open for business” button +1F23C..1F23F ; Extended_Pictographic# E0.0 [4] (🈼..🈿) .. +1F249..1F24F ; Extended_Pictographic# E0.0 [7] (🉉..🉏) .. +1F250..1F251 ; Extended_Pictographic# E0.6 [2] (🉐..🉑) Japanese “bargain” button..Japanese “acceptable” button +1F252..1F25F ; Extended_Pictographic# E0.0 [14] (🉒..🉟) .. +1F266..1F2FF ; Extended_Pictographic# E0.0 [154] (🉦..🋿) .. 
+1F300..1F30C ; Extended_Pictographic# E0.6 [13] (🌀..🌌) cyclone..milky way +1F30D..1F30E ; Extended_Pictographic# E0.7 [2] (🌍..🌎) globe showing Europe-Africa..globe showing Americas +1F30F ; Extended_Pictographic# E0.6 [1] (🌏) globe showing Asia-Australia +1F310 ; Extended_Pictographic# E1.0 [1] (🌐) globe with meridians +1F311 ; Extended_Pictographic# E0.6 [1] (🌑) new moon +1F312 ; Extended_Pictographic# E1.0 [1] (🌒) waxing crescent moon +1F313..1F315 ; Extended_Pictographic# E0.6 [3] (🌓..🌕) first quarter moon..full moon +1F316..1F318 ; Extended_Pictographic# E1.0 [3] (🌖..🌘) waning gibbous moon..waning crescent moon +1F319 ; Extended_Pictographic# E0.6 [1] (🌙) crescent moon +1F31A ; Extended_Pictographic# E1.0 [1] (🌚) new moon face +1F31B ; Extended_Pictographic# E0.6 [1] (🌛) first quarter moon face +1F31C ; Extended_Pictographic# E0.7 [1] (🌜) last quarter moon face +1F31D..1F31E ; Extended_Pictographic# E1.0 [2] (🌝..🌞) full moon face..sun with face +1F31F..1F320 ; Extended_Pictographic# E0.6 [2] (🌟..🌠) glowing star..shooting star +1F321 ; Extended_Pictographic# E0.7 [1] (🌡️) thermometer +1F324..1F32C ; Extended_Pictographic# E0.7 [9] (🌤️..🌬️) sun behind small cloud..wind face +1F32D..1F32F ; Extended_Pictographic# E1.0 [3] (🌭..🌯) hot dog..burrito +1F330..1F331 ; Extended_Pictographic# E0.6 [2] (🌰..🌱) chestnut..seedling +1F332..1F333 ; Extended_Pictographic# E1.0 [2] (🌲..🌳) evergreen tree..deciduous tree +1F334..1F335 ; Extended_Pictographic# E0.6 [2] (🌴..🌵) palm tree..cactus +1F336 ; Extended_Pictographic# E0.7 [1] (🌶️) hot pepper +1F337..1F34A ; Extended_Pictographic# E0.6 [20] (🌷..🍊) tulip..tangerine +1F34B ; Extended_Pictographic# E1.0 [1] (🍋) lemon +1F34C..1F34F ; Extended_Pictographic# E0.6 [4] (🍌..🍏) banana..green apple +1F350 ; Extended_Pictographic# E1.0 [1] (🍐) pear +1F351..1F37B ; Extended_Pictographic# E0.6 [43] (🍑..🍻) peach..clinking beer mugs +1F37C ; Extended_Pictographic# E1.0 [1] (🍼) baby bottle +1F37D ; Extended_Pictographic# E0.7 [1] (🍽️) fork 
and knife with plate +1F37E..1F37F ; Extended_Pictographic# E1.0 [2] (🍾..🍿) bottle with popping cork..popcorn +1F380..1F393 ; Extended_Pictographic# E0.6 [20] (🎀..🎓) ribbon..graduation cap +1F396..1F397 ; Extended_Pictographic# E0.7 [2] (🎖️..🎗️) military medal..reminder ribbon +1F399..1F39B ; Extended_Pictographic# E0.7 [3] (🎙️..🎛️) studio microphone..control knobs +1F39E..1F39F ; Extended_Pictographic# E0.7 [2] (🎞️..🎟️) film frames..admission tickets +1F3A0..1F3C4 ; Extended_Pictographic# E0.6 [37] (🎠..🏄) carousel horse..person surfing +1F3C5 ; Extended_Pictographic# E1.0 [1] (🏅) sports medal +1F3C6 ; Extended_Pictographic# E0.6 [1] (🏆) trophy +1F3C7 ; Extended_Pictographic# E1.0 [1] (🏇) horse racing +1F3C8 ; Extended_Pictographic# E0.6 [1] (🏈) american football +1F3C9 ; Extended_Pictographic# E1.0 [1] (🏉) rugby football +1F3CA ; Extended_Pictographic# E0.6 [1] (🏊) person swimming +1F3CB..1F3CE ; Extended_Pictographic# E0.7 [4] (🏋️..🏎️) person lifting weights..racing car +1F3CF..1F3D3 ; Extended_Pictographic# E1.0 [5] (🏏..🏓) cricket game..ping pong +1F3D4..1F3DF ; Extended_Pictographic# E0.7 [12] (🏔️..🏟️) snow-capped mountain..stadium +1F3E0..1F3E3 ; Extended_Pictographic# E0.6 [4] (🏠..🏣) house..Japanese post office +1F3E4 ; Extended_Pictographic# E1.0 [1] (🏤) post office +1F3E5..1F3F0 ; Extended_Pictographic# E0.6 [12] (🏥..🏰) hospital..castle +1F3F3 ; Extended_Pictographic# E0.7 [1] (🏳️) white flag +1F3F4 ; Extended_Pictographic# E1.0 [1] (🏴) black flag +1F3F5 ; Extended_Pictographic# E0.7 [1] (🏵️) rosette +1F3F7 ; Extended_Pictographic# E0.7 [1] (🏷️) label +1F3F8..1F3FA ; Extended_Pictographic# E1.0 [3] (🏸..🏺) badminton..amphora +1F400..1F407 ; Extended_Pictographic# E1.0 [8] (🐀..🐇) rat..rabbit +1F408 ; Extended_Pictographic# E0.7 [1] (🐈) cat +1F409..1F40B ; Extended_Pictographic# E1.0 [3] (🐉..🐋) dragon..whale +1F40C..1F40E ; Extended_Pictographic# E0.6 [3] (🐌..🐎) snail..horse +1F40F..1F410 ; Extended_Pictographic# E1.0 [2] (🐏..🐐) ram..goat +1F411..1F412 ; 
Extended_Pictographic# E0.6 [2] (🐑..🐒) ewe..monkey +1F413 ; Extended_Pictographic# E1.0 [1] (🐓) rooster +1F414 ; Extended_Pictographic# E0.6 [1] (🐔) chicken +1F415 ; Extended_Pictographic# E0.7 [1] (🐕) dog +1F416 ; Extended_Pictographic# E1.0 [1] (🐖) pig +1F417..1F429 ; Extended_Pictographic# E0.6 [19] (🐗..🐩) boar..poodle +1F42A ; Extended_Pictographic# E1.0 [1] (🐪) camel +1F42B..1F43E ; Extended_Pictographic# E0.6 [20] (🐫..🐾) two-hump camel..paw prints +1F43F ; Extended_Pictographic# E0.7 [1] (🐿️) chipmunk +1F440 ; Extended_Pictographic# E0.6 [1] (👀) eyes +1F441 ; Extended_Pictographic# E0.7 [1] (👁️) eye +1F442..1F464 ; Extended_Pictographic# E0.6 [35] (👂..👤) ear..bust in silhouette +1F465 ; Extended_Pictographic# E1.0 [1] (👥) busts in silhouette +1F466..1F46B ; Extended_Pictographic# E0.6 [6] (👦..👫) boy..woman and man holding hands +1F46C..1F46D ; Extended_Pictographic# E1.0 [2] (👬..👭) men holding hands..women holding hands +1F46E..1F4AC ; Extended_Pictographic# E0.6 [63] (👮..💬) police officer..speech balloon +1F4AD ; Extended_Pictographic# E1.0 [1] (💭) thought balloon +1F4AE..1F4B5 ; Extended_Pictographic# E0.6 [8] (💮..💵) white flower..dollar banknote +1F4B6..1F4B7 ; Extended_Pictographic# E1.0 [2] (💶..💷) euro banknote..pound banknote +1F4B8..1F4EB ; Extended_Pictographic# E0.6 [52] (💸..📫) money with wings..closed mailbox with raised flag +1F4EC..1F4ED ; Extended_Pictographic# E0.7 [2] (📬..📭) open mailbox with raised flag..open mailbox with lowered flag +1F4EE ; Extended_Pictographic# E0.6 [1] (📮) postbox +1F4EF ; Extended_Pictographic# E1.0 [1] (📯) postal horn +1F4F0..1F4F4 ; Extended_Pictographic# E0.6 [5] (📰..📴) newspaper..mobile phone off +1F4F5 ; Extended_Pictographic# E1.0 [1] (📵) no mobile phones +1F4F6..1F4F7 ; Extended_Pictographic# E0.6 [2] (📶..📷) antenna bars..camera +1F4F8 ; Extended_Pictographic# E1.0 [1] (📸) camera with flash +1F4F9..1F4FC ; Extended_Pictographic# E0.6 [4] (📹..📼) video camera..videocassette +1F4FD ; Extended_Pictographic# E0.7 [1] 
(📽️) film projector +1F4FF..1F502 ; Extended_Pictographic# E1.0 [4] (📿..🔂) prayer beads..repeat single button +1F503 ; Extended_Pictographic# E0.6 [1] (🔃) clockwise vertical arrows +1F504..1F507 ; Extended_Pictographic# E1.0 [4] (🔄..🔇) counterclockwise arrows button..muted speaker +1F508 ; Extended_Pictographic# E0.7 [1] (🔈) speaker low volume +1F509 ; Extended_Pictographic# E1.0 [1] (🔉) speaker medium volume +1F50A..1F514 ; Extended_Pictographic# E0.6 [11] (🔊..🔔) speaker high volume..bell +1F515 ; Extended_Pictographic# E1.0 [1] (🔕) bell with slash +1F516..1F52B ; Extended_Pictographic# E0.6 [22] (🔖..🔫) bookmark..water pistol +1F52C..1F52D ; Extended_Pictographic# E1.0 [2] (🔬..🔭) microscope..telescope +1F52E..1F53D ; Extended_Pictographic# E0.6 [16] (🔮..🔽) crystal ball..downwards button +1F549..1F54A ; Extended_Pictographic# E0.7 [2] (🕉️..🕊️) om..dove +1F54B..1F54E ; Extended_Pictographic# E1.0 [4] (🕋..🕎) kaaba..menorah +1F550..1F55B ; Extended_Pictographic# E0.6 [12] (🕐..🕛) one o’clock..twelve o’clock +1F55C..1F567 ; Extended_Pictographic# E0.7 [12] (🕜..🕧) one-thirty..twelve-thirty +1F56F..1F570 ; Extended_Pictographic# E0.7 [2] (🕯️..🕰️) candle..mantelpiece clock +1F573..1F579 ; Extended_Pictographic# E0.7 [7] (🕳️..🕹️) hole..joystick +1F57A ; Extended_Pictographic# E3.0 [1] (🕺) man dancing +1F587 ; Extended_Pictographic# E0.7 [1] (🖇️) linked paperclips +1F58A..1F58D ; Extended_Pictographic# E0.7 [4] (🖊️..🖍️) pen..crayon +1F590 ; Extended_Pictographic# E0.7 [1] (🖐️) hand with fingers splayed +1F595..1F596 ; Extended_Pictographic# E1.0 [2] (🖕..🖖) middle finger..vulcan salute +1F5A4 ; Extended_Pictographic# E3.0 [1] (🖤) black heart +1F5A5 ; Extended_Pictographic# E0.7 [1] (🖥️) desktop computer +1F5A8 ; Extended_Pictographic# E0.7 [1] (🖨️) printer +1F5B1..1F5B2 ; Extended_Pictographic# E0.7 [2] (🖱️..🖲️) computer mouse..trackball +1F5BC ; Extended_Pictographic# E0.7 [1] (🖼️) framed picture +1F5C2..1F5C4 ; Extended_Pictographic# E0.7 [3] (🗂️..🗄️) card index 
dividers..file cabinet +1F5D1..1F5D3 ; Extended_Pictographic# E0.7 [3] (🗑️..🗓️) wastebasket..spiral calendar +1F5DC..1F5DE ; Extended_Pictographic# E0.7 [3] (🗜️..🗞️) clamp..rolled-up newspaper +1F5E1 ; Extended_Pictographic# E0.7 [1] (🗡️) dagger +1F5E3 ; Extended_Pictographic# E0.7 [1] (🗣️) speaking head +1F5E8 ; Extended_Pictographic# E2.0 [1] (🗨️) left speech bubble +1F5EF ; Extended_Pictographic# E0.7 [1] (🗯️) right anger bubble +1F5F3 ; Extended_Pictographic# E0.7 [1] (🗳️) ballot box with ballot +1F5FA ; Extended_Pictographic# E0.7 [1] (🗺️) world map +1F5FB..1F5FF ; Extended_Pictographic# E0.6 [5] (🗻..🗿) mount fuji..moai +1F600 ; Extended_Pictographic# E1.0 [1] (😀) grinning face +1F601..1F606 ; Extended_Pictographic# E0.6 [6] (😁..😆) beaming face with smiling eyes..grinning squinting face +1F607..1F608 ; Extended_Pictographic# E1.0 [2] (😇..😈) smiling face with halo..smiling face with horns +1F609..1F60D ; Extended_Pictographic# E0.6 [5] (😉..😍) winking face..smiling face with heart-eyes +1F60E ; Extended_Pictographic# E1.0 [1] (😎) smiling face with sunglasses +1F60F ; Extended_Pictographic# E0.6 [1] (😏) smirking face +1F610 ; Extended_Pictographic# E0.7 [1] (😐) neutral face +1F611 ; Extended_Pictographic# E1.0 [1] (😑) expressionless face +1F612..1F614 ; Extended_Pictographic# E0.6 [3] (😒..😔) unamused face..pensive face +1F615 ; Extended_Pictographic# E1.0 [1] (😕) confused face +1F616 ; Extended_Pictographic# E0.6 [1] (😖) confounded face +1F617 ; Extended_Pictographic# E1.0 [1] (😗) kissing face +1F618 ; Extended_Pictographic# E0.6 [1] (😘) face blowing a kiss +1F619 ; Extended_Pictographic# E1.0 [1] (😙) kissing face with smiling eyes +1F61A ; Extended_Pictographic# E0.6 [1] (😚) kissing face with closed eyes +1F61B ; Extended_Pictographic# E1.0 [1] (😛) face with tongue +1F61C..1F61E ; Extended_Pictographic# E0.6 [3] (😜..😞) winking face with tongue..disappointed face +1F61F ; Extended_Pictographic# E1.0 [1] (😟) worried face +1F620..1F625 ; Extended_Pictographic# E0.6 
[6] (😠..😥) angry face..sad but relieved face +1F626..1F627 ; Extended_Pictographic# E1.0 [2] (😦..😧) frowning face with open mouth..anguished face +1F628..1F62B ; Extended_Pictographic# E0.6 [4] (😨..😫) fearful face..tired face +1F62C ; Extended_Pictographic# E1.0 [1] (😬) grimacing face +1F62D ; Extended_Pictographic# E0.6 [1] (😭) loudly crying face +1F62E..1F62F ; Extended_Pictographic# E1.0 [2] (😮..😯) face with open mouth..hushed face +1F630..1F633 ; Extended_Pictographic# E0.6 [4] (😰..😳) anxious face with sweat..flushed face +1F634 ; Extended_Pictographic# E1.0 [1] (😴) sleeping face +1F635 ; Extended_Pictographic# E0.6 [1] (😵) face with crossed-out eyes +1F636 ; Extended_Pictographic# E1.0 [1] (😶) face without mouth +1F637..1F640 ; Extended_Pictographic# E0.6 [10] (😷..🙀) face with medical mask..weary cat +1F641..1F644 ; Extended_Pictographic# E1.0 [4] (🙁..🙄) slightly frowning face..face with rolling eyes +1F645..1F64F ; Extended_Pictographic# E0.6 [11] (🙅..🙏) person gesturing NO..folded hands +1F680 ; Extended_Pictographic# E0.6 [1] (🚀) rocket +1F681..1F682 ; Extended_Pictographic# E1.0 [2] (🚁..🚂) helicopter..locomotive +1F683..1F685 ; Extended_Pictographic# E0.6 [3] (🚃..🚅) railway car..bullet train +1F686 ; Extended_Pictographic# E1.0 [1] (🚆) train +1F687 ; Extended_Pictographic# E0.6 [1] (🚇) metro +1F688 ; Extended_Pictographic# E1.0 [1] (🚈) light rail +1F689 ; Extended_Pictographic# E0.6 [1] (🚉) station +1F68A..1F68B ; Extended_Pictographic# E1.0 [2] (🚊..🚋) tram..tram car +1F68C ; Extended_Pictographic# E0.6 [1] (🚌) bus +1F68D ; Extended_Pictographic# E0.7 [1] (🚍) oncoming bus +1F68E ; Extended_Pictographic# E1.0 [1] (🚎) trolleybus +1F68F ; Extended_Pictographic# E0.6 [1] (🚏) bus stop +1F690 ; Extended_Pictographic# E1.0 [1] (🚐) minibus +1F691..1F693 ; Extended_Pictographic# E0.6 [3] (🚑..🚓) ambulance..police car +1F694 ; Extended_Pictographic# E0.7 [1] (🚔) oncoming police car +1F695 ; Extended_Pictographic# E0.6 [1] (🚕) taxi +1F696 ; Extended_Pictographic# E1.0 
[1] (🚖) oncoming taxi +1F697 ; Extended_Pictographic# E0.6 [1] (🚗) automobile +1F698 ; Extended_Pictographic# E0.7 [1] (🚘) oncoming automobile +1F699..1F69A ; Extended_Pictographic# E0.6 [2] (🚙..🚚) sport utility vehicle..delivery truck +1F69B..1F6A1 ; Extended_Pictographic# E1.0 [7] (🚛..🚡) articulated lorry..aerial tramway +1F6A2 ; Extended_Pictographic# E0.6 [1] (🚢) ship +1F6A3 ; Extended_Pictographic# E1.0 [1] (🚣) person rowing boat +1F6A4..1F6A5 ; Extended_Pictographic# E0.6 [2] (🚤..🚥) speedboat..horizontal traffic light +1F6A6 ; Extended_Pictographic# E1.0 [1] (🚦) vertical traffic light +1F6A7..1F6AD ; Extended_Pictographic# E0.6 [7] (🚧..🚭) construction..no smoking +1F6AE..1F6B1 ; Extended_Pictographic# E1.0 [4] (🚮..🚱) litter in bin sign..non-potable water +1F6B2 ; Extended_Pictographic# E0.6 [1] (🚲) bicycle +1F6B3..1F6B5 ; Extended_Pictographic# E1.0 [3] (🚳..🚵) no bicycles..person mountain biking +1F6B6 ; Extended_Pictographic# E0.6 [1] (🚶) person walking +1F6B7..1F6B8 ; Extended_Pictographic# E1.0 [2] (🚷..🚸) no pedestrians..children crossing +1F6B9..1F6BE ; Extended_Pictographic# E0.6 [6] (🚹..🚾) men’s room..water closet +1F6BF ; Extended_Pictographic# E1.0 [1] (🚿) shower +1F6C0 ; Extended_Pictographic# E0.6 [1] (🛀) person taking bath +1F6C1..1F6C5 ; Extended_Pictographic# E1.0 [5] (🛁..🛅) bathtub..left luggage +1F6CB ; Extended_Pictographic# E0.7 [1] (🛋️) couch and lamp +1F6CC ; Extended_Pictographic# E1.0 [1] (🛌) person in bed +1F6CD..1F6CF ; Extended_Pictographic# E0.7 [3] (🛍️..🛏️) shopping bags..bed +1F6D0 ; Extended_Pictographic# E1.0 [1] (🛐) place of worship +1F6D1..1F6D2 ; Extended_Pictographic# E3.0 [2] (🛑..🛒) stop sign..shopping cart +1F6D5 ; Extended_Pictographic# E12.0 [1] (🛕) hindu temple +1F6D6..1F6D7 ; Extended_Pictographic# E13.0 [2] (🛖..🛗) hut..elevator +1F6D8 ; Extended_Pictographic# E17.0 [1] (🛘) landslide +1F6D9..1F6DB ; Extended_Pictographic# E0.0 [3] (🛙..🛛) .. 
+1F6DC ; Extended_Pictographic# E15.0 [1] (🛜) wireless +1F6DD..1F6DF ; Extended_Pictographic# E14.0 [3] (🛝..🛟) playground slide..ring buoy +1F6E0..1F6E5 ; Extended_Pictographic# E0.7 [6] (🛠️..🛥️) hammer and wrench..motor boat +1F6E9 ; Extended_Pictographic# E0.7 [1] (🛩️) small airplane +1F6EB..1F6EC ; Extended_Pictographic# E1.0 [2] (🛫..🛬) airplane departure..airplane arrival +1F6ED..1F6EF ; Extended_Pictographic# E0.0 [3] (🛭..🛯) .. +1F6F0 ; Extended_Pictographic# E0.7 [1] (🛰️) satellite +1F6F3 ; Extended_Pictographic# E0.7 [1] (🛳️) passenger ship +1F6F4..1F6F6 ; Extended_Pictographic# E3.0 [3] (🛴..🛶) kick scooter..canoe +1F6F7..1F6F8 ; Extended_Pictographic# E5.0 [2] (🛷..🛸) sled..flying saucer +1F6F9 ; Extended_Pictographic# E11.0 [1] (🛹) skateboard +1F6FA ; Extended_Pictographic# E12.0 [1] (🛺) auto rickshaw +1F6FB..1F6FC ; Extended_Pictographic# E13.0 [2] (🛻..🛼) pickup truck..roller skate +1F6FD..1F6FF ; Extended_Pictographic# E0.0 [3] (🛽..🛿) .. +1F7DA..1F7DF ; Extended_Pictographic# E0.0 [6] (🟚..🟟) .. +1F7E0..1F7EB ; Extended_Pictographic# E12.0 [12] (🟠..🟫) orange circle..brown square +1F7EC..1F7EF ; Extended_Pictographic# E0.0 [4] (🟬..🟯) .. +1F7F0 ; Extended_Pictographic# E14.0 [1] (🟰) heavy equals sign +1F7F1..1F7FF ; Extended_Pictographic# E0.0 [15] (🟱..🟿) .. +1F80C..1F80F ; Extended_Pictographic# E0.0 [4] (🠌..🠏) .. +1F848..1F84F ; Extended_Pictographic# E0.0 [8] (🡈..🡏) .. +1F85A..1F85F ; Extended_Pictographic# E0.0 [6] (🡚..🡟) .. +1F888..1F88F ; Extended_Pictographic# E0.0 [8] (🢈..🢏) .. +1F8AE..1F8AF ; Extended_Pictographic# E0.0 [2] (🢮..🢯) .. +1F8BC..1F8BF ; Extended_Pictographic# E0.0 [4] (🢼..🢿) .. +1F8C2..1F8CF ; Extended_Pictographic# E0.0 [14] (🣂..🣏) .. +1F8D9..1F8FF ; Extended_Pictographic# E0.0 [39] (🣙..🣿) .. 
+1F90C ; Extended_Pictographic# E13.0 [1] (🤌) pinched fingers +1F90D..1F90F ; Extended_Pictographic# E12.0 [3] (🤍..🤏) white heart..pinching hand +1F910..1F918 ; Extended_Pictographic# E1.0 [9] (🤐..🤘) zipper-mouth face..sign of the horns +1F919..1F91E ; Extended_Pictographic# E3.0 [6] (🤙..🤞) call me hand..crossed fingers +1F91F ; Extended_Pictographic# E5.0 [1] (🤟) love-you gesture +1F920..1F927 ; Extended_Pictographic# E3.0 [8] (🤠..🤧) cowboy hat face..sneezing face +1F928..1F92F ; Extended_Pictographic# E5.0 [8] (🤨..🤯) face with raised eyebrow..exploding head +1F930 ; Extended_Pictographic# E3.0 [1] (🤰) pregnant woman +1F931..1F932 ; Extended_Pictographic# E5.0 [2] (🤱..🤲) breast-feeding..palms up together +1F933..1F93A ; Extended_Pictographic# E3.0 [8] (🤳..🤺) selfie..person fencing +1F93C..1F93E ; Extended_Pictographic# E3.0 [3] (🤼..🤾) people wrestling..person playing handball +1F93F ; Extended_Pictographic# E12.0 [1] (🤿) diving mask +1F940..1F945 ; Extended_Pictographic# E3.0 [6] (🥀..🥅) wilted flower..goal net +1F947..1F94B ; Extended_Pictographic# E3.0 [5] (🥇..🥋) 1st place medal..martial arts uniform +1F94C ; Extended_Pictographic# E5.0 [1] (🥌) curling stone +1F94D..1F94F ; Extended_Pictographic# E11.0 [3] (🥍..🥏) lacrosse..flying disc +1F950..1F95E ; Extended_Pictographic# E3.0 [15] (🥐..🥞) croissant..pancakes +1F95F..1F96B ; Extended_Pictographic# E5.0 [13] (🥟..🥫) dumpling..canned food +1F96C..1F970 ; Extended_Pictographic# E11.0 [5] (🥬..🥰) leafy green..smiling face with hearts +1F971 ; Extended_Pictographic# E12.0 [1] (🥱) yawning face +1F972 ; Extended_Pictographic# E13.0 [1] (🥲) smiling face with tear +1F973..1F976 ; Extended_Pictographic# E11.0 [4] (🥳..🥶) partying face..cold face +1F977..1F978 ; Extended_Pictographic# E13.0 [2] (🥷..🥸) ninja..disguised face +1F979 ; Extended_Pictographic# E14.0 [1] (🥹) face holding back tears +1F97A ; Extended_Pictographic# E11.0 [1] (🥺) pleading face +1F97B ; Extended_Pictographic# E12.0 [1] (🥻) sari +1F97C..1F97F ; 
Extended_Pictographic# E11.0 [4] (🥼..🥿) lab coat..flat shoe +1F980..1F984 ; Extended_Pictographic# E1.0 [5] (🦀..🦄) crab..unicorn +1F985..1F991 ; Extended_Pictographic# E3.0 [13] (🦅..🦑) eagle..squid +1F992..1F997 ; Extended_Pictographic# E5.0 [6] (🦒..🦗) giraffe..cricket +1F998..1F9A2 ; Extended_Pictographic# E11.0 [11] (🦘..🦢) kangaroo..swan +1F9A3..1F9A4 ; Extended_Pictographic# E13.0 [2] (🦣..🦤) mammoth..dodo +1F9A5..1F9AA ; Extended_Pictographic# E12.0 [6] (🦥..🦪) sloth..oyster +1F9AB..1F9AD ; Extended_Pictographic# E13.0 [3] (🦫..🦭) beaver..seal +1F9AE..1F9AF ; Extended_Pictographic# E12.0 [2] (🦮..🦯) guide dog..white cane +1F9B0..1F9B9 ; Extended_Pictographic# E11.0 [10] (🦰..🦹) red hair..supervillain +1F9BA..1F9BF ; Extended_Pictographic# E12.0 [6] (🦺..🦿) safety vest..mechanical leg +1F9C0 ; Extended_Pictographic# E1.0 [1] (🧀) cheese wedge +1F9C1..1F9C2 ; Extended_Pictographic# E11.0 [2] (🧁..🧂) cupcake..salt +1F9C3..1F9CA ; Extended_Pictographic# E12.0 [8] (🧃..🧊) beverage box..ice +1F9CB ; Extended_Pictographic# E13.0 [1] (🧋) bubble tea +1F9CC ; Extended_Pictographic# E14.0 [1] (🧌) troll +1F9CD..1F9CF ; Extended_Pictographic# E12.0 [3] (🧍..🧏) person standing..deaf person +1F9D0..1F9E6 ; Extended_Pictographic# E5.0 [23] (🧐..🧦) face with monocle..socks +1F9E7..1F9FF ; Extended_Pictographic# E11.0 [25] (🧧..🧿) red envelope..nazar amulet +1FA58..1FA5F ; Extended_Pictographic# E0.0 [8] (🩘..🩟) .. +1FA6E..1FA6F ; Extended_Pictographic# E0.0 [2] (🩮..🩯) .. +1FA70..1FA73 ; Extended_Pictographic# E12.0 [4] (🩰..🩳) ballet shoes..shorts +1FA74 ; Extended_Pictographic# E13.0 [1] (🩴) thong sandal +1FA75..1FA77 ; Extended_Pictographic# E15.0 [3] (🩵..🩷) light blue heart..pink heart +1FA78..1FA7A ; Extended_Pictographic# E12.0 [3] (🩸..🩺) drop of blood..stethoscope +1FA7B..1FA7C ; Extended_Pictographic# E14.0 [2] (🩻..🩼) x-ray..crutch +1FA7D..1FA7F ; Extended_Pictographic# E0.0 [3] (🩽..🩿) .. 
+1FA80..1FA82 ; Extended_Pictographic# E12.0 [3] (🪀..🪂) yo-yo..parachute +1FA83..1FA86 ; Extended_Pictographic# E13.0 [4] (🪃..🪆) boomerang..nesting dolls +1FA87..1FA88 ; Extended_Pictographic# E15.0 [2] (🪇..🪈) maracas..flute +1FA89 ; Extended_Pictographic# E16.0 [1] (🪉) harp +1FA8A ; Extended_Pictographic# E17.0 [1] (🪊) trombone +1FA8B..1FA8D ; Extended_Pictographic# E0.0 [3] (🪋..🪍) .. +1FA8E ; Extended_Pictographic# E17.0 [1] (🪎) treasure chest +1FA8F ; Extended_Pictographic# E16.0 [1] (🪏) shovel +1FA90..1FA95 ; Extended_Pictographic# E12.0 [6] (🪐..🪕) ringed planet..banjo +1FA96..1FAA8 ; Extended_Pictographic# E13.0 [19] (🪖..🪨) military helmet..rock +1FAA9..1FAAC ; Extended_Pictographic# E14.0 [4] (🪩..🪬) mirror ball..hamsa +1FAAD..1FAAF ; Extended_Pictographic# E15.0 [3] (🪭..🪯) folding hand fan..khanda +1FAB0..1FAB6 ; Extended_Pictographic# E13.0 [7] (🪰..🪶) fly..feather +1FAB7..1FABA ; Extended_Pictographic# E14.0 [4] (🪷..🪺) lotus..nest with eggs +1FABB..1FABD ; Extended_Pictographic# E15.0 [3] (🪻..🪽) hyacinth..wing +1FABE ; Extended_Pictographic# E16.0 [1] (🪾) leafless tree +1FABF ; Extended_Pictographic# E15.0 [1] (🪿) goose +1FAC0..1FAC2 ; Extended_Pictographic# E13.0 [3] (🫀..🫂) anatomical heart..people hugging +1FAC3..1FAC5 ; Extended_Pictographic# E14.0 [3] (🫃..🫅) pregnant man..person with crown +1FAC6 ; Extended_Pictographic# E16.0 [1] (🫆) fingerprint +1FAC7 ; Extended_Pictographic# E0.0 [1] (🫇) +1FAC8 ; Extended_Pictographic# E17.0 [1] (🫈) hairy creature +1FAC9..1FACC ; Extended_Pictographic# E0.0 [4] (🫉..🫌) .. 
+1FACD ; Extended_Pictographic# E17.0 [1] (🫍) orca +1FACE..1FACF ; Extended_Pictographic# E15.0 [2] (🫎..🫏) moose..donkey +1FAD0..1FAD6 ; Extended_Pictographic# E13.0 [7] (🫐..🫖) blueberries..teapot +1FAD7..1FAD9 ; Extended_Pictographic# E14.0 [3] (🫗..🫙) pouring liquid..jar +1FADA..1FADB ; Extended_Pictographic# E15.0 [2] (🫚..🫛) ginger root..pea pod +1FADC ; Extended_Pictographic# E16.0 [1] (🫜) root vegetable +1FADD..1FADE ; Extended_Pictographic# E0.0 [2] (🫝..🫞) <reserved-1FADD>..<reserved-1FADE> +1FADF ; Extended_Pictographic# E16.0 [1] (🫟) splatter +1FAE0..1FAE7 ; Extended_Pictographic# E14.0 [8] (🫠..🫧) melting face..bubbles +1FAE8 ; Extended_Pictographic# E15.0 [1] (🫨) shaking face +1FAE9 ; Extended_Pictographic# E16.0 [1] (🫩) face with bags under eyes +1FAEA ; Extended_Pictographic# E17.0 [1] (🫪) distorted face +1FAEB..1FAEE ; Extended_Pictographic# E0.0 [4] (🫫..🫮) <reserved-1FAEB>..<reserved-1FAEE> +1FAEF ; Extended_Pictographic# E17.0 [1] (🫯) fight cloud +1FAF0..1FAF6 ; Extended_Pictographic# E14.0 [7] (🫰..🫶) hand with index finger and thumb crossed..heart hands +1FAF7..1FAF8 ; Extended_Pictographic# E15.0 [2] (🫷..🫸) leftwards pushing hand..rightwards pushing hand +1FAF9..1FAFF ; Extended_Pictographic# E0.0 [7] (🫹..🫿) <reserved-1FAF9>..<reserved-1FAFF> +1FC00..1FFFD ; Extended_Pictographic# E0.0[1022] (🰀..🿽) <reserved-1FC00>..<reserved-1FFFD> diff --git a/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/WordBreakProperty.txt b/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/WordBreakProperty.txt new file mode 100644 index 000000000..20fa24e37 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/WordBreakProperty.txt @@ -0,0 +1,1541 @@ +# WordBreakProperty-17.0.0.txt +# Date: 2025-06-30, 06:20:49 GMT +# © 2025 Unicode®, Inc. +# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
+# For terms of use and license, see https://www.unicode.org/terms_of_use.html +# +# Unicode Character Database +# For documentation, see https://www.unicode.org/reports/tr44/ + +# ================================================ + +# Property: Word_Break + +# All code points not explicitly listed for Word_Break +# have the value Other (XX). + +# @missing: 0000..10FFFF; Other + +# ================================================ + +0022 ; Double_Quote # Po QUOTATION MARK + +# Total code points: 1 + +# ================================================ + +0027 ; Single_Quote # Po APOSTROPHE + +# Total code points: 1 + +# ================================================ + +05D0..05EA ; Hebrew_Letter # Lo [27] HEBREW LETTER ALEF..HEBREW LETTER TAV +05EF..05F2 ; Hebrew_Letter # Lo [4] HEBREW YOD TRIANGLE..HEBREW LIGATURE YIDDISH DOUBLE YOD +FB1D ; Hebrew_Letter # Lo HEBREW LETTER YOD WITH HIRIQ +FB1F..FB28 ; Hebrew_Letter # Lo [10] HEBREW LIGATURE YIDDISH YOD YOD PATAH..HEBREW LETTER WIDE TAV +FB2A..FB36 ; Hebrew_Letter # Lo [13] HEBREW LETTER SHIN WITH SHIN DOT..HEBREW LETTER ZAYIN WITH DAGESH +FB38..FB3C ; Hebrew_Letter # Lo [5] HEBREW LETTER TET WITH DAGESH..HEBREW LETTER LAMED WITH DAGESH +FB3E ; Hebrew_Letter # Lo HEBREW LETTER MEM WITH DAGESH +FB40..FB41 ; Hebrew_Letter # Lo [2] HEBREW LETTER NUN WITH DAGESH..HEBREW LETTER SAMEKH WITH DAGESH +FB43..FB44 ; Hebrew_Letter # Lo [2] HEBREW LETTER FINAL PE WITH DAGESH..HEBREW LETTER PE WITH DAGESH +FB46..FB4F ; Hebrew_Letter # Lo [10] HEBREW LETTER TSADI WITH DAGESH..HEBREW LIGATURE ALEF LAMED + +# Total code points: 75 + +# ================================================ + +000D ; CR # Cc <control-000D> + +# Total code points: 1 + +# ================================================ + +000A ; LF # Cc <control-000A> + +# Total code points: 1 + +# ================================================ + +000B..000C ; Newline # Cc [2] <control-000B>..<control-000C>
+0085 ; Newline # Cc +2028 ; Newline # Zl LINE SEPARATOR +2029 ; Newline # Zp PARAGRAPH SEPARATOR + +# Total code points: 5 + +# ================================================ + +0300..036F ; Extend # Mn [112] COMBINING GRAVE ACCENT..COMBINING LATIN SMALL LETTER X +0483..0487 ; Extend # Mn [5] COMBINING CYRILLIC TITLO..COMBINING CYRILLIC POKRYTIE +0488..0489 ; Extend # Me [2] COMBINING CYRILLIC HUNDRED THOUSANDS SIGN..COMBINING CYRILLIC MILLIONS SIGN +0591..05BD ; Extend # Mn [45] HEBREW ACCENT ETNAHTA..HEBREW POINT METEG +05BF ; Extend # Mn HEBREW POINT RAFE +05C1..05C2 ; Extend # Mn [2] HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT +05C4..05C5 ; Extend # Mn [2] HEBREW MARK UPPER DOT..HEBREW MARK LOWER DOT +05C7 ; Extend # Mn HEBREW POINT QAMATS QATAN +0610..061A ; Extend # Mn [11] ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..ARABIC SMALL KASRA +064B..065F ; Extend # Mn [21] ARABIC FATHATAN..ARABIC WAVY HAMZA BELOW +0670 ; Extend # Mn ARABIC LETTER SUPERSCRIPT ALEF +06D6..06DC ; Extend # Mn [7] ARABIC SMALL HIGH LIGATURE SAD WITH LAM WITH ALEF MAKSURA..ARABIC SMALL HIGH SEEN +06DF..06E4 ; Extend # Mn [6] ARABIC SMALL HIGH ROUNDED ZERO..ARABIC SMALL HIGH MADDA +06E7..06E8 ; Extend # Mn [2] ARABIC SMALL HIGH YEH..ARABIC SMALL HIGH NOON +06EA..06ED ; Extend # Mn [4] ARABIC EMPTY CENTRE LOW STOP..ARABIC SMALL LOW MEEM +0711 ; Extend # Mn SYRIAC LETTER SUPERSCRIPT ALAPH +0730..074A ; Extend # Mn [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH +07A6..07B0 ; Extend # Mn [11] THAANA ABAFILI..THAANA SUKUN +07EB..07F3 ; Extend # Mn [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE +07FD ; Extend # Mn NKO DANTAYALAN +0816..0819 ; Extend # Mn [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH +081B..0823 ; Extend # Mn [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A +0825..0827 ; Extend # Mn [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U +0829..082D ; Extend # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA +0859..085B ; Extend # Mn [3] MANDAIC 
AFFRICATION MARK..MANDAIC GEMINATION MARK +0897..089F ; Extend # Mn [9] ARABIC PEPET..ARABIC HALF MADDA OVER MADDA +08CA..08E1 ; Extend # Mn [24] ARABIC SMALL HIGH FARSI YEH..ARABIC SMALL HIGH SIGN SAFHA +08E3..0902 ; Extend # Mn [32] ARABIC TURNED DAMMA BELOW..DEVANAGARI SIGN ANUSVARA +0903 ; Extend # Mc DEVANAGARI SIGN VISARGA +093A ; Extend # Mn DEVANAGARI VOWEL SIGN OE +093B ; Extend # Mc DEVANAGARI VOWEL SIGN OOE +093C ; Extend # Mn DEVANAGARI SIGN NUKTA +093E..0940 ; Extend # Mc [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II +0941..0948 ; Extend # Mn [8] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN AI +0949..094C ; Extend # Mc [4] DEVANAGARI VOWEL SIGN CANDRA O..DEVANAGARI VOWEL SIGN AU +094D ; Extend # Mn DEVANAGARI SIGN VIRAMA +094E..094F ; Extend # Mc [2] DEVANAGARI VOWEL SIGN PRISHTHAMATRA E..DEVANAGARI VOWEL SIGN AW +0951..0957 ; Extend # Mn [7] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI VOWEL SIGN UUE +0962..0963 ; Extend # Mn [2] DEVANAGARI VOWEL SIGN VOCALIC L..DEVANAGARI VOWEL SIGN VOCALIC LL +0981 ; Extend # Mn BENGALI SIGN CANDRABINDU +0982..0983 ; Extend # Mc [2] BENGALI SIGN ANUSVARA..BENGALI SIGN VISARGA +09BC ; Extend # Mn BENGALI SIGN NUKTA +09BE..09C0 ; Extend # Mc [3] BENGALI VOWEL SIGN AA..BENGALI VOWEL SIGN II +09C1..09C4 ; Extend # Mn [4] BENGALI VOWEL SIGN U..BENGALI VOWEL SIGN VOCALIC RR +09C7..09C8 ; Extend # Mc [2] BENGALI VOWEL SIGN E..BENGALI VOWEL SIGN AI +09CB..09CC ; Extend # Mc [2] BENGALI VOWEL SIGN O..BENGALI VOWEL SIGN AU +09CD ; Extend # Mn BENGALI SIGN VIRAMA +09D7 ; Extend # Mc BENGALI AU LENGTH MARK +09E2..09E3 ; Extend # Mn [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL +09FE ; Extend # Mn BENGALI SANDHI MARK +0A01..0A02 ; Extend # Mn [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI +0A03 ; Extend # Mc GURMUKHI SIGN VISARGA +0A3C ; Extend # Mn GURMUKHI SIGN NUKTA +0A3E..0A40 ; Extend # Mc [3] GURMUKHI VOWEL SIGN AA..GURMUKHI VOWEL SIGN II +0A41..0A42 ; Extend # Mn [2] GURMUKHI VOWEL SIGN 
U..GURMUKHI VOWEL SIGN UU +0A47..0A48 ; Extend # Mn [2] GURMUKHI VOWEL SIGN EE..GURMUKHI VOWEL SIGN AI +0A4B..0A4D ; Extend # Mn [3] GURMUKHI VOWEL SIGN OO..GURMUKHI SIGN VIRAMA +0A51 ; Extend # Mn GURMUKHI SIGN UDAAT +0A70..0A71 ; Extend # Mn [2] GURMUKHI TIPPI..GURMUKHI ADDAK +0A75 ; Extend # Mn GURMUKHI SIGN YAKASH +0A81..0A82 ; Extend # Mn [2] GUJARATI SIGN CANDRABINDU..GUJARATI SIGN ANUSVARA +0A83 ; Extend # Mc GUJARATI SIGN VISARGA +0ABC ; Extend # Mn GUJARATI SIGN NUKTA +0ABE..0AC0 ; Extend # Mc [3] GUJARATI VOWEL SIGN AA..GUJARATI VOWEL SIGN II +0AC1..0AC5 ; Extend # Mn [5] GUJARATI VOWEL SIGN U..GUJARATI VOWEL SIGN CANDRA E +0AC7..0AC8 ; Extend # Mn [2] GUJARATI VOWEL SIGN E..GUJARATI VOWEL SIGN AI +0AC9 ; Extend # Mc GUJARATI VOWEL SIGN CANDRA O +0ACB..0ACC ; Extend # Mc [2] GUJARATI VOWEL SIGN O..GUJARATI VOWEL SIGN AU +0ACD ; Extend # Mn GUJARATI SIGN VIRAMA +0AE2..0AE3 ; Extend # Mn [2] GUJARATI VOWEL SIGN VOCALIC L..GUJARATI VOWEL SIGN VOCALIC LL +0AFA..0AFF ; Extend # Mn [6] GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE NUKTA ABOVE +0B01 ; Extend # Mn ORIYA SIGN CANDRABINDU +0B02..0B03 ; Extend # Mc [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA +0B3C ; Extend # Mn ORIYA SIGN NUKTA +0B3E ; Extend # Mc ORIYA VOWEL SIGN AA +0B3F ; Extend # Mn ORIYA VOWEL SIGN I +0B40 ; Extend # Mc ORIYA VOWEL SIGN II +0B41..0B44 ; Extend # Mn [4] ORIYA VOWEL SIGN U..ORIYA VOWEL SIGN VOCALIC RR +0B47..0B48 ; Extend # Mc [2] ORIYA VOWEL SIGN E..ORIYA VOWEL SIGN AI +0B4B..0B4C ; Extend # Mc [2] ORIYA VOWEL SIGN O..ORIYA VOWEL SIGN AU +0B4D ; Extend # Mn ORIYA SIGN VIRAMA +0B55..0B56 ; Extend # Mn [2] ORIYA SIGN OVERLINE..ORIYA AI LENGTH MARK +0B57 ; Extend # Mc ORIYA AU LENGTH MARK +0B62..0B63 ; Extend # Mn [2] ORIYA VOWEL SIGN VOCALIC L..ORIYA VOWEL SIGN VOCALIC LL +0B82 ; Extend # Mn TAMIL SIGN ANUSVARA +0BBE..0BBF ; Extend # Mc [2] TAMIL VOWEL SIGN AA..TAMIL VOWEL SIGN I +0BC0 ; Extend # Mn TAMIL VOWEL SIGN II +0BC1..0BC2 ; Extend # Mc [2] TAMIL VOWEL SIGN U..TAMIL VOWEL 
SIGN UU +0BC6..0BC8 ; Extend # Mc [3] TAMIL VOWEL SIGN E..TAMIL VOWEL SIGN AI +0BCA..0BCC ; Extend # Mc [3] TAMIL VOWEL SIGN O..TAMIL VOWEL SIGN AU +0BCD ; Extend # Mn TAMIL SIGN VIRAMA +0BD7 ; Extend # Mc TAMIL AU LENGTH MARK +0C00 ; Extend # Mn TELUGU SIGN COMBINING CANDRABINDU ABOVE +0C01..0C03 ; Extend # Mc [3] TELUGU SIGN CANDRABINDU..TELUGU SIGN VISARGA +0C04 ; Extend # Mn TELUGU SIGN COMBINING ANUSVARA ABOVE +0C3C ; Extend # Mn TELUGU SIGN NUKTA +0C3E..0C40 ; Extend # Mn [3] TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN II +0C41..0C44 ; Extend # Mc [4] TELUGU VOWEL SIGN U..TELUGU VOWEL SIGN VOCALIC RR +0C46..0C48 ; Extend # Mn [3] TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN AI +0C4A..0C4D ; Extend # Mn [4] TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA +0C55..0C56 ; Extend # Mn [2] TELUGU LENGTH MARK..TELUGU AI LENGTH MARK +0C62..0C63 ; Extend # Mn [2] TELUGU VOWEL SIGN VOCALIC L..TELUGU VOWEL SIGN VOCALIC LL +0C81 ; Extend # Mn KANNADA SIGN CANDRABINDU +0C82..0C83 ; Extend # Mc [2] KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA +0CBC ; Extend # Mn KANNADA SIGN NUKTA +0CBE ; Extend # Mc KANNADA VOWEL SIGN AA +0CBF ; Extend # Mn KANNADA VOWEL SIGN I +0CC0..0CC4 ; Extend # Mc [5] KANNADA VOWEL SIGN II..KANNADA VOWEL SIGN VOCALIC RR +0CC6 ; Extend # Mn KANNADA VOWEL SIGN E +0CC7..0CC8 ; Extend # Mc [2] KANNADA VOWEL SIGN EE..KANNADA VOWEL SIGN AI +0CCA..0CCB ; Extend # Mc [2] KANNADA VOWEL SIGN O..KANNADA VOWEL SIGN OO +0CCC..0CCD ; Extend # Mn [2] KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA +0CD5..0CD6 ; Extend # Mc [2] KANNADA LENGTH MARK..KANNADA AI LENGTH MARK +0CE2..0CE3 ; Extend # Mn [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL +0CF3 ; Extend # Mc KANNADA SIGN COMBINING ANUSVARA ABOVE RIGHT +0D00..0D01 ; Extend # Mn [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU +0D02..0D03 ; Extend # Mc [2] MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA +0D3B..0D3C ; Extend # Mn [2] MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM SIGN CIRCULAR VIRAMA 
+0D3E..0D40 ; Extend # Mc [3] MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN II +0D41..0D44 ; Extend # Mn [4] MALAYALAM VOWEL SIGN U..MALAYALAM VOWEL SIGN VOCALIC RR +0D46..0D48 ; Extend # Mc [3] MALAYALAM VOWEL SIGN E..MALAYALAM VOWEL SIGN AI +0D4A..0D4C ; Extend # Mc [3] MALAYALAM VOWEL SIGN O..MALAYALAM VOWEL SIGN AU +0D4D ; Extend # Mn MALAYALAM SIGN VIRAMA +0D57 ; Extend # Mc MALAYALAM AU LENGTH MARK +0D62..0D63 ; Extend # Mn [2] MALAYALAM VOWEL SIGN VOCALIC L..MALAYALAM VOWEL SIGN VOCALIC LL +0D81 ; Extend # Mn SINHALA SIGN CANDRABINDU +0D82..0D83 ; Extend # Mc [2] SINHALA SIGN ANUSVARAYA..SINHALA SIGN VISARGAYA +0DCA ; Extend # Mn SINHALA SIGN AL-LAKUNA +0DCF..0DD1 ; Extend # Mc [3] SINHALA VOWEL SIGN AELA-PILLA..SINHALA VOWEL SIGN DIGA AEDA-PILLA +0DD2..0DD4 ; Extend # Mn [3] SINHALA VOWEL SIGN KETTI IS-PILLA..SINHALA VOWEL SIGN KETTI PAA-PILLA +0DD6 ; Extend # Mn SINHALA VOWEL SIGN DIGA PAA-PILLA +0DD8..0DDF ; Extend # Mc [8] SINHALA VOWEL SIGN GAETTA-PILLA..SINHALA VOWEL SIGN GAYANUKITTA +0DF2..0DF3 ; Extend # Mc [2] SINHALA VOWEL SIGN DIGA GAETTA-PILLA..SINHALA VOWEL SIGN DIGA GAYANUKITTA +0E31 ; Extend # Mn THAI CHARACTER MAI HAN-AKAT +0E34..0E3A ; Extend # Mn [7] THAI CHARACTER SARA I..THAI CHARACTER PHINTHU +0E47..0E4E ; Extend # Mn [8] THAI CHARACTER MAITAIKHU..THAI CHARACTER YAMAKKAN +0EB1 ; Extend # Mn LAO VOWEL SIGN MAI KAN +0EB4..0EBC ; Extend # Mn [9] LAO VOWEL SIGN I..LAO SEMIVOWEL SIGN LO +0EC8..0ECE ; Extend # Mn [7] LAO TONE MAI EK..LAO YAMAKKAN +0F18..0F19 ; Extend # Mn [2] TIBETAN ASTROLOGICAL SIGN -KHYUD PA..TIBETAN ASTROLOGICAL SIGN SDONG TSHUGS +0F35 ; Extend # Mn TIBETAN MARK NGAS BZUNG NYI ZLA +0F37 ; Extend # Mn TIBETAN MARK NGAS BZUNG SGOR RTAGS +0F39 ; Extend # Mn TIBETAN MARK TSA -PHRU +0F3E..0F3F ; Extend # Mc [2] TIBETAN SIGN YAR TSHES..TIBETAN SIGN MAR TSHES +0F71..0F7E ; Extend # Mn [14] TIBETAN VOWEL SIGN AA..TIBETAN SIGN RJES SU NGA RO +0F7F ; Extend # Mc TIBETAN SIGN RNAM BCAD +0F80..0F84 ; Extend # Mn [5] TIBETAN VOWEL SIGN 
REVERSED I..TIBETAN MARK HALANTA +0F86..0F87 ; Extend # Mn [2] TIBETAN SIGN LCI RTAGS..TIBETAN SIGN YANG RTAGS +0F8D..0F97 ; Extend # Mn [11] TIBETAN SUBJOINED SIGN LCE TSA CAN..TIBETAN SUBJOINED LETTER JA +0F99..0FBC ; Extend # Mn [36] TIBETAN SUBJOINED LETTER NYA..TIBETAN SUBJOINED LETTER FIXED-FORM RA +0FC6 ; Extend # Mn TIBETAN SYMBOL PADMA GDAN +102B..102C ; Extend # Mc [2] MYANMAR VOWEL SIGN TALL AA..MYANMAR VOWEL SIGN AA +102D..1030 ; Extend # Mn [4] MYANMAR VOWEL SIGN I..MYANMAR VOWEL SIGN UU +1031 ; Extend # Mc MYANMAR VOWEL SIGN E +1032..1037 ; Extend # Mn [6] MYANMAR VOWEL SIGN AI..MYANMAR SIGN DOT BELOW +1038 ; Extend # Mc MYANMAR SIGN VISARGA +1039..103A ; Extend # Mn [2] MYANMAR SIGN VIRAMA..MYANMAR SIGN ASAT +103B..103C ; Extend # Mc [2] MYANMAR CONSONANT SIGN MEDIAL YA..MYANMAR CONSONANT SIGN MEDIAL RA +103D..103E ; Extend # Mn [2] MYANMAR CONSONANT SIGN MEDIAL WA..MYANMAR CONSONANT SIGN MEDIAL HA +1056..1057 ; Extend # Mc [2] MYANMAR VOWEL SIGN VOCALIC R..MYANMAR VOWEL SIGN VOCALIC RR +1058..1059 ; Extend # Mn [2] MYANMAR VOWEL SIGN VOCALIC L..MYANMAR VOWEL SIGN VOCALIC LL +105E..1060 ; Extend # Mn [3] MYANMAR CONSONANT SIGN MON MEDIAL NA..MYANMAR CONSONANT SIGN MON MEDIAL LA +1062..1064 ; Extend # Mc [3] MYANMAR VOWEL SIGN SGAW KAREN EU..MYANMAR TONE MARK SGAW KAREN KE PHO +1067..106D ; Extend # Mc [7] MYANMAR VOWEL SIGN WESTERN PWO KAREN EU..MYANMAR SIGN WESTERN PWO KAREN TONE-5 +1071..1074 ; Extend # Mn [4] MYANMAR VOWEL SIGN GEBA KAREN I..MYANMAR VOWEL SIGN KAYAH EE +1082 ; Extend # Mn MYANMAR CONSONANT SIGN SHAN MEDIAL WA +1083..1084 ; Extend # Mc [2] MYANMAR VOWEL SIGN SHAN AA..MYANMAR VOWEL SIGN SHAN E +1085..1086 ; Extend # Mn [2] MYANMAR VOWEL SIGN SHAN E ABOVE..MYANMAR VOWEL SIGN SHAN FINAL Y +1087..108C ; Extend # Mc [6] MYANMAR SIGN SHAN TONE-2..MYANMAR SIGN SHAN COUNCIL TONE-3 +108D ; Extend # Mn MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE +108F ; Extend # Mc MYANMAR SIGN RUMAI PALAUNG TONE-5 +109A..109C ; Extend # Mc [3] MYANMAR SIGN 
KHAMTI TONE-1..MYANMAR VOWEL SIGN AITON A +109D ; Extend # Mn MYANMAR VOWEL SIGN AITON AI +135D..135F ; Extend # Mn [3] ETHIOPIC COMBINING GEMINATION AND VOWEL LENGTH MARK..ETHIOPIC COMBINING GEMINATION MARK +1712..1714 ; Extend # Mn [3] TAGALOG VOWEL SIGN I..TAGALOG SIGN VIRAMA +1715 ; Extend # Mc TAGALOG SIGN PAMUDPOD +1732..1733 ; Extend # Mn [2] HANUNOO VOWEL SIGN I..HANUNOO VOWEL SIGN U +1734 ; Extend # Mc HANUNOO SIGN PAMUDPOD +1752..1753 ; Extend # Mn [2] BUHID VOWEL SIGN I..BUHID VOWEL SIGN U +1772..1773 ; Extend # Mn [2] TAGBANWA VOWEL SIGN I..TAGBANWA VOWEL SIGN U +17B4..17B5 ; Extend # Mn [2] KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT AA +17B6 ; Extend # Mc KHMER VOWEL SIGN AA +17B7..17BD ; Extend # Mn [7] KHMER VOWEL SIGN I..KHMER VOWEL SIGN UA +17BE..17C5 ; Extend # Mc [8] KHMER VOWEL SIGN OE..KHMER VOWEL SIGN AU +17C6 ; Extend # Mn KHMER SIGN NIKAHIT +17C7..17C8 ; Extend # Mc [2] KHMER SIGN REAHMUK..KHMER SIGN YUUKALEAPINTU +17C9..17D3 ; Extend # Mn [11] KHMER SIGN MUUSIKATOAN..KHMER SIGN BATHAMASAT +17DD ; Extend # Mn KHMER SIGN ATTHACAN +180B..180D ; Extend # Mn [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE +180F ; Extend # Mn MONGOLIAN FREE VARIATION SELECTOR FOUR +1885..1886 ; Extend # Mn [2] MONGOLIAN LETTER ALI GALI BALUDA..MONGOLIAN LETTER ALI GALI THREE BALUDA +18A9 ; Extend # Mn MONGOLIAN LETTER ALI GALI DAGALGA +1920..1922 ; Extend # Mn [3] LIMBU VOWEL SIGN A..LIMBU VOWEL SIGN U +1923..1926 ; Extend # Mc [4] LIMBU VOWEL SIGN EE..LIMBU VOWEL SIGN AU +1927..1928 ; Extend # Mn [2] LIMBU VOWEL SIGN E..LIMBU VOWEL SIGN O +1929..192B ; Extend # Mc [3] LIMBU SUBJOINED LETTER YA..LIMBU SUBJOINED LETTER WA +1930..1931 ; Extend # Mc [2] LIMBU SMALL LETTER KA..LIMBU SMALL LETTER NGA +1932 ; Extend # Mn LIMBU SMALL LETTER ANUSVARA +1933..1938 ; Extend # Mc [6] LIMBU SMALL LETTER TA..LIMBU SMALL LETTER LA +1939..193B ; Extend # Mn [3] LIMBU SIGN MUKPHRENG..LIMBU SIGN SA-I +1A17..1A18 ; Extend # Mn [2] BUGINESE 
VOWEL SIGN I..BUGINESE VOWEL SIGN U +1A19..1A1A ; Extend # Mc [2] BUGINESE VOWEL SIGN E..BUGINESE VOWEL SIGN O +1A1B ; Extend # Mn BUGINESE VOWEL SIGN AE +1A55 ; Extend # Mc TAI THAM CONSONANT SIGN MEDIAL RA +1A56 ; Extend # Mn TAI THAM CONSONANT SIGN MEDIAL LA +1A57 ; Extend # Mc TAI THAM CONSONANT SIGN LA TANG LAI +1A58..1A5E ; Extend # Mn [7] TAI THAM SIGN MAI KANG LAI..TAI THAM CONSONANT SIGN SA +1A60 ; Extend # Mn TAI THAM SIGN SAKOT +1A61 ; Extend # Mc TAI THAM VOWEL SIGN A +1A62 ; Extend # Mn TAI THAM VOWEL SIGN MAI SAT +1A63..1A64 ; Extend # Mc [2] TAI THAM VOWEL SIGN AA..TAI THAM VOWEL SIGN TALL AA +1A65..1A6C ; Extend # Mn [8] TAI THAM VOWEL SIGN I..TAI THAM VOWEL SIGN OA BELOW +1A6D..1A72 ; Extend # Mc [6] TAI THAM VOWEL SIGN OY..TAI THAM VOWEL SIGN THAM AI +1A73..1A7C ; Extend # Mn [10] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN KHUEN-LUE KARAN +1A7F ; Extend # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT +1AB0..1ABD ; Extend # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW +1ABE ; Extend # Me COMBINING PARENTHESES OVERLAY +1ABF..1ADD ; Extend # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW +1AE0..1AEB ; Extend # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE +1B00..1B03 ; Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG +1B04 ; Extend # Mc BALINESE SIGN BISAH +1B34 ; Extend # Mn BALINESE SIGN REREKAN +1B35 ; Extend # Mc BALINESE VOWEL SIGN TEDUNG +1B36..1B3A ; Extend # Mn [5] BALINESE VOWEL SIGN ULU..BALINESE VOWEL SIGN RA REPA +1B3B ; Extend # Mc BALINESE VOWEL SIGN RA REPA TEDUNG +1B3C ; Extend # Mn BALINESE VOWEL SIGN LA LENGA +1B3D..1B41 ; Extend # Mc [5] BALINESE VOWEL SIGN LA LENGA TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG +1B42 ; Extend # Mn BALINESE VOWEL SIGN PEPET +1B43..1B44 ; Extend # Mc [2] BALINESE VOWEL SIGN PEPET TEDUNG..BALINESE ADEG ADEG +1B6B..1B73 ; Extend # Mn [9] BALINESE MUSICAL SYMBOL COMBINING TEGEH..BALINESE MUSICAL SYMBOL 
COMBINING GONG +1B80..1B81 ; Extend # Mn [2] SUNDANESE SIGN PANYECEK..SUNDANESE SIGN PANGLAYAR +1B82 ; Extend # Mc SUNDANESE SIGN PANGWISAD +1BA1 ; Extend # Mc SUNDANESE CONSONANT SIGN PAMINGKAL +1BA2..1BA5 ; Extend # Mn [4] SUNDANESE CONSONANT SIGN PANYAKRA..SUNDANESE VOWEL SIGN PANYUKU +1BA6..1BA7 ; Extend # Mc [2] SUNDANESE VOWEL SIGN PANAELAENG..SUNDANESE VOWEL SIGN PANOLONG +1BA8..1BA9 ; Extend # Mn [2] SUNDANESE VOWEL SIGN PAMEPET..SUNDANESE VOWEL SIGN PANEULEUNG +1BAA ; Extend # Mc SUNDANESE SIGN PAMAAEH +1BAB..1BAD ; Extend # Mn [3] SUNDANESE SIGN VIRAMA..SUNDANESE CONSONANT SIGN PASANGAN WA +1BE6 ; Extend # Mn BATAK SIGN TOMPI +1BE7 ; Extend # Mc BATAK VOWEL SIGN E +1BE8..1BE9 ; Extend # Mn [2] BATAK VOWEL SIGN PAKPAK E..BATAK VOWEL SIGN EE +1BEA..1BEC ; Extend # Mc [3] BATAK VOWEL SIGN I..BATAK VOWEL SIGN O +1BED ; Extend # Mn BATAK VOWEL SIGN KARO O +1BEE ; Extend # Mc BATAK VOWEL SIGN U +1BEF..1BF1 ; Extend # Mn [3] BATAK VOWEL SIGN U FOR SIMALUNGUN SA..BATAK CONSONANT SIGN H +1BF2..1BF3 ; Extend # Mc [2] BATAK PANGOLAT..BATAK PANONGONAN +1C24..1C2B ; Extend # Mc [8] LEPCHA SUBJOINED LETTER YA..LEPCHA VOWEL SIGN UU +1C2C..1C33 ; Extend # Mn [8] LEPCHA VOWEL SIGN E..LEPCHA CONSONANT SIGN T +1C34..1C35 ; Extend # Mc [2] LEPCHA CONSONANT SIGN NYIN-DO..LEPCHA CONSONANT SIGN KANG +1C36..1C37 ; Extend # Mn [2] LEPCHA SIGN RAN..LEPCHA SIGN NUKTA +1CD0..1CD2 ; Extend # Mn [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA +1CD4..1CE0 ; Extend # Mn [13] VEDIC SIGN YAJURVEDIC MIDLINE SVARITA..VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA +1CE1 ; Extend # Mc VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA +1CE2..1CE8 ; Extend # Mn [7] VEDIC SIGN VISARGA SVARITA..VEDIC SIGN VISARGA ANUDATTA WITH TAIL +1CED ; Extend # Mn VEDIC SIGN TIRYAK +1CF4 ; Extend # Mn VEDIC TONE CANDRA ABOVE +1CF7 ; Extend # Mc VEDIC SIGN ATIKRAMA +1CF8..1CF9 ; Extend # Mn [2] VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING ABOVE +1DC0..1DFF ; Extend # Mn [64] COMBINING DOTTED GRAVE ACCENT..COMBINING 
RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW +200C ; Extend # Cf ZERO WIDTH NON-JOINER +20D0..20DC ; Extend # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE +20DD..20E0 ; Extend # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH +20E1 ; Extend # Mn COMBINING LEFT RIGHT ARROW ABOVE +20E2..20E4 ; Extend # Me [3] COMBINING ENCLOSING SCREEN..COMBINING ENCLOSING UPWARD POINTING TRIANGLE +20E5..20F0 ; Extend # Mn [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE +2CEF..2CF1 ; Extend # Mn [3] COPTIC COMBINING NI ABOVE..COPTIC COMBINING SPIRITUS LENIS +2D7F ; Extend # Mn TIFINAGH CONSONANT JOINER +2DE0..2DFF ; Extend # Mn [32] COMBINING CYRILLIC LETTER BE..COMBINING CYRILLIC LETTER IOTIFIED BIG YUS +302A..302D ; Extend # Mn [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK +302E..302F ; Extend # Mc [2] HANGUL SINGLE DOT TONE MARK..HANGUL DOUBLE DOT TONE MARK +3099..309A ; Extend # Mn [2] COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK +A66F ; Extend # Mn COMBINING CYRILLIC VZMET +A670..A672 ; Extend # Me [3] COMBINING CYRILLIC TEN MILLIONS SIGN..COMBINING CYRILLIC THOUSAND MILLIONS SIGN +A674..A67D ; Extend # Mn [10] COMBINING CYRILLIC LETTER UKRAINIAN IE..COMBINING CYRILLIC PAYEROK +A69E..A69F ; Extend # Mn [2] COMBINING CYRILLIC LETTER EF..COMBINING CYRILLIC LETTER IOTIFIED E +A6F0..A6F1 ; Extend # Mn [2] BAMUM COMBINING MARK KOQNDON..BAMUM COMBINING MARK TUKWENTIS +A802 ; Extend # Mn SYLOTI NAGRI SIGN DVISVARA +A806 ; Extend # Mn SYLOTI NAGRI SIGN HASANTA +A80B ; Extend # Mn SYLOTI NAGRI SIGN ANUSVARA +A823..A824 ; Extend # Mc [2] SYLOTI NAGRI VOWEL SIGN A..SYLOTI NAGRI VOWEL SIGN I +A825..A826 ; Extend # Mn [2] SYLOTI NAGRI VOWEL SIGN U..SYLOTI NAGRI VOWEL SIGN E +A827 ; Extend # Mc SYLOTI NAGRI VOWEL SIGN OO +A82C ; Extend # Mn SYLOTI NAGRI SIGN ALTERNATE HASANTA +A880..A881 ; Extend # Mc [2] SAURASHTRA SIGN ANUSVARA..SAURASHTRA SIGN VISARGA +A8B4..A8C3 ; 
Extend # Mc [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL SIGN AU +A8C4..A8C5 ; Extend # Mn [2] SAURASHTRA SIGN VIRAMA..SAURASHTRA SIGN CANDRABINDU +A8E0..A8F1 ; Extend # Mn [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA +A8FF ; Extend # Mn DEVANAGARI VOWEL SIGN AY +A926..A92D ; Extend # Mn [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU +A947..A951 ; Extend # Mn [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R +A952..A953 ; Extend # Mc [2] REJANG CONSONANT SIGN H..REJANG VIRAMA +A980..A982 ; Extend # Mn [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR +A983 ; Extend # Mc JAVANESE SIGN WIGNYAN +A9B3 ; Extend # Mn JAVANESE SIGN CECAK TELU +A9B4..A9B5 ; Extend # Mc [2] JAVANESE VOWEL SIGN TARUNG..JAVANESE VOWEL SIGN TOLONG +A9B6..A9B9 ; Extend # Mn [4] JAVANESE VOWEL SIGN WULU..JAVANESE VOWEL SIGN SUKU MENDUT +A9BA..A9BB ; Extend # Mc [2] JAVANESE VOWEL SIGN TALING..JAVANESE VOWEL SIGN DIRGA MURE +A9BC..A9BD ; Extend # Mn [2] JAVANESE VOWEL SIGN PEPET..JAVANESE CONSONANT SIGN KERET +A9BE..A9C0 ; Extend # Mc [3] JAVANESE CONSONANT SIGN PENGKAL..JAVANESE PANGKON +A9E5 ; Extend # Mn MYANMAR SIGN SHAN SAW +AA29..AA2E ; Extend # Mn [6] CHAM VOWEL SIGN AA..CHAM VOWEL SIGN OE +AA2F..AA30 ; Extend # Mc [2] CHAM VOWEL SIGN O..CHAM VOWEL SIGN AI +AA31..AA32 ; Extend # Mn [2] CHAM VOWEL SIGN AU..CHAM VOWEL SIGN UE +AA33..AA34 ; Extend # Mc [2] CHAM CONSONANT SIGN YA..CHAM CONSONANT SIGN RA +AA35..AA36 ; Extend # Mn [2] CHAM CONSONANT SIGN LA..CHAM CONSONANT SIGN WA +AA43 ; Extend # Mn CHAM CONSONANT SIGN FINAL NG +AA4C ; Extend # Mn CHAM CONSONANT SIGN FINAL M +AA4D ; Extend # Mc CHAM CONSONANT SIGN FINAL H +AA7B ; Extend # Mc MYANMAR SIGN PAO KAREN TONE +AA7C ; Extend # Mn MYANMAR SIGN TAI LAING TONE-2 +AA7D ; Extend # Mc MYANMAR SIGN TAI LAING TONE-5 +AAB0 ; Extend # Mn TAI VIET MAI KANG +AAB2..AAB4 ; Extend # Mn [3] TAI VIET VOWEL I..TAI VIET VOWEL U +AAB7..AAB8 ; Extend # Mn [2] TAI VIET MAI KHIT..TAI VIET VOWEL IA +AABE..AABF ; Extend 
# Mn [2] TAI VIET VOWEL AM..TAI VIET TONE MAI EK +AAC1 ; Extend # Mn TAI VIET TONE MAI THO +AAEB ; Extend # Mc MEETEI MAYEK VOWEL SIGN II +AAEC..AAED ; Extend # Mn [2] MEETEI MAYEK VOWEL SIGN UU..MEETEI MAYEK VOWEL SIGN AAI +AAEE..AAEF ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN AU..MEETEI MAYEK VOWEL SIGN AAU +AAF5 ; Extend # Mc MEETEI MAYEK VOWEL SIGN VISARGA +AAF6 ; Extend # Mn MEETEI MAYEK VIRAMA +ABE3..ABE4 ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN ONAP..MEETEI MAYEK VOWEL SIGN INAP +ABE5 ; Extend # Mn MEETEI MAYEK VOWEL SIGN ANAP +ABE6..ABE7 ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN YENAP..MEETEI MAYEK VOWEL SIGN SOUNAP +ABE8 ; Extend # Mn MEETEI MAYEK VOWEL SIGN UNAP +ABE9..ABEA ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEETEI MAYEK VOWEL SIGN NUNG +ABEC ; Extend # Mc MEETEI MAYEK LUM IYEK +ABED ; Extend # Mn MEETEI MAYEK APUN IYEK +FB1E ; Extend # Mn HEBREW POINT JUDEO-SPANISH VARIKA +FE00..FE0F ; Extend # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16 +FE20..FE2F ; Extend # Mn [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITLO RIGHT HALF +FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK +101FD ; Extend # Mn PHAISTOS DISC SIGN COMBINING OBLIQUE STROKE +102E0 ; Extend # Mn COPTIC EPACT THOUSANDS MARK +10376..1037A ; Extend # Mn [5] COMBINING OLD PERMIC LETTER AN..COMBINING OLD PERMIC LETTER SII +10A01..10A03 ; Extend # Mn [3] KHAROSHTHI VOWEL SIGN I..KHAROSHTHI VOWEL SIGN VOCALIC R +10A05..10A06 ; Extend # Mn [2] KHAROSHTHI VOWEL SIGN E..KHAROSHTHI VOWEL SIGN O +10A0C..10A0F ; Extend # Mn [4] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN VISARGA +10A38..10A3A ; Extend # Mn [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW +10A3F ; Extend # Mn KHAROSHTHI VIRAMA +10AE5..10AE6 ; Extend # Mn [2] MANICHAEAN ABBREVIATION MARK ABOVE..MANICHAEAN ABBREVIATION MARK BELOW +10D24..10D27 ; Extend # Mn [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI +10D69..10D6D ; Extend 
# Mn [5] GARAY VOWEL SIGN E..GARAY CONSONANT NASALIZATION MARK +10EAB..10EAC ; Extend # Mn [2] YEZIDI COMBINING HAMZA MARK..YEZIDI COMBINING MADDA MARK +10EFA..10EFF ; Extend # Mn [6] ARABIC DOUBLE VERTICAL BAR BELOW..ARABIC SMALL LOW WORD MADDA +10F46..10F50 ; Extend # Mn [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW +10F82..10F85 ; Extend # Mn [4] OLD UYGHUR COMBINING DOT ABOVE..OLD UYGHUR COMBINING TWO DOTS BELOW +11000 ; Extend # Mc BRAHMI SIGN CANDRABINDU +11001 ; Extend # Mn BRAHMI SIGN ANUSVARA +11002 ; Extend # Mc BRAHMI SIGN VISARGA +11038..11046 ; Extend # Mn [15] BRAHMI VOWEL SIGN AA..BRAHMI VIRAMA +11070 ; Extend # Mn BRAHMI SIGN OLD TAMIL VIRAMA +11073..11074 ; Extend # Mn [2] BRAHMI VOWEL SIGN OLD TAMIL SHORT E..BRAHMI VOWEL SIGN OLD TAMIL SHORT O +1107F..11081 ; Extend # Mn [3] BRAHMI NUMBER JOINER..KAITHI SIGN ANUSVARA +11082 ; Extend # Mc KAITHI SIGN VISARGA +110B0..110B2 ; Extend # Mc [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II +110B3..110B6 ; Extend # Mn [4] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN AI +110B7..110B8 ; Extend # Mc [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU +110B9..110BA ; Extend # Mn [2] KAITHI SIGN VIRAMA..KAITHI SIGN NUKTA +110C2 ; Extend # Mn KAITHI VOWEL SIGN VOCALIC R +11100..11102 ; Extend # Mn [3] CHAKMA SIGN CANDRABINDU..CHAKMA SIGN VISARGA +11127..1112B ; Extend # Mn [5] CHAKMA VOWEL SIGN A..CHAKMA VOWEL SIGN UU +1112C ; Extend # Mc CHAKMA VOWEL SIGN E +1112D..11134 ; Extend # Mn [8] CHAKMA VOWEL SIGN AI..CHAKMA MAAYYAA +11145..11146 ; Extend # Mc [2] CHAKMA VOWEL SIGN AA..CHAKMA VOWEL SIGN EI +11173 ; Extend # Mn MAHAJANI SIGN NUKTA +11180..11181 ; Extend # Mn [2] SHARADA SIGN CANDRABINDU..SHARADA SIGN ANUSVARA +11182 ; Extend # Mc SHARADA SIGN VISARGA +111B3..111B5 ; Extend # Mc [3] SHARADA VOWEL SIGN AA..SHARADA VOWEL SIGN II +111B6..111BE ; Extend # Mn [9] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN O +111BF..111C0 ; Extend # Mc [2] SHARADA VOWEL SIGN AU..SHARADA SIGN VIRAMA +111C9..111CC ; Extend # 
Mn [4] SHARADA SANDHI MARK..SHARADA EXTRA SHORT VOWEL MARK +111CE ; Extend # Mc SHARADA VOWEL SIGN PRISHTHAMATRA E +111CF ; Extend # Mn SHARADA SIGN INVERTED CANDRABINDU +1122C..1122E ; Extend # Mc [3] KHOJKI VOWEL SIGN AA..KHOJKI VOWEL SIGN II +1122F..11231 ; Extend # Mn [3] KHOJKI VOWEL SIGN U..KHOJKI VOWEL SIGN AI +11232..11233 ; Extend # Mc [2] KHOJKI VOWEL SIGN O..KHOJKI VOWEL SIGN AU +11234 ; Extend # Mn KHOJKI SIGN ANUSVARA +11235 ; Extend # Mc KHOJKI SIGN VIRAMA +11236..11237 ; Extend # Mn [2] KHOJKI SIGN NUKTA..KHOJKI SIGN SHADDA +1123E ; Extend # Mn KHOJKI SIGN SUKUN +11241 ; Extend # Mn KHOJKI VOWEL SIGN VOCALIC R +112DF ; Extend # Mn KHUDAWADI SIGN ANUSVARA +112E0..112E2 ; Extend # Mc [3] KHUDAWADI VOWEL SIGN AA..KHUDAWADI VOWEL SIGN II +112E3..112EA ; Extend # Mn [8] KHUDAWADI VOWEL SIGN U..KHUDAWADI SIGN VIRAMA +11300..11301 ; Extend # Mn [2] GRANTHA SIGN COMBINING ANUSVARA ABOVE..GRANTHA SIGN CANDRABINDU +11302..11303 ; Extend # Mc [2] GRANTHA SIGN ANUSVARA..GRANTHA SIGN VISARGA +1133B..1133C ; Extend # Mn [2] COMBINING BINDU BELOW..GRANTHA SIGN NUKTA +1133E..1133F ; Extend # Mc [2] GRANTHA VOWEL SIGN AA..GRANTHA VOWEL SIGN I +11340 ; Extend # Mn GRANTHA VOWEL SIGN II +11341..11344 ; Extend # Mc [4] GRANTHA VOWEL SIGN U..GRANTHA VOWEL SIGN VOCALIC RR +11347..11348 ; Extend # Mc [2] GRANTHA VOWEL SIGN EE..GRANTHA VOWEL SIGN AI +1134B..1134D ; Extend # Mc [3] GRANTHA VOWEL SIGN OO..GRANTHA SIGN VIRAMA +11357 ; Extend # Mc GRANTHA AU LENGTH MARK +11362..11363 ; Extend # Mc [2] GRANTHA VOWEL SIGN VOCALIC L..GRANTHA VOWEL SIGN VOCALIC LL +11366..1136C ; Extend # Mn [7] COMBINING GRANTHA DIGIT ZERO..COMBINING GRANTHA DIGIT SIX +11370..11374 ; Extend # Mn [5] COMBINING GRANTHA LETTER A..COMBINING GRANTHA LETTER PA +113B8..113BA ; Extend # Mc [3] TULU-TIGALARI VOWEL SIGN AA..TULU-TIGALARI VOWEL SIGN II +113BB..113C0 ; Extend # Mn [6] TULU-TIGALARI VOWEL SIGN U..TULU-TIGALARI VOWEL SIGN VOCALIC LL +113C2 ; Extend # Mc TULU-TIGALARI VOWEL SIGN EE +113C5 ; 
Extend # Mc TULU-TIGALARI VOWEL SIGN AI +113C7..113CA ; Extend # Mc [4] TULU-TIGALARI VOWEL SIGN OO..TULU-TIGALARI SIGN CANDRA ANUNASIKA +113CC..113CD ; Extend # Mc [2] TULU-TIGALARI SIGN ANUSVARA..TULU-TIGALARI SIGN VISARGA +113CE ; Extend # Mn TULU-TIGALARI SIGN VIRAMA +113CF ; Extend # Mc TULU-TIGALARI SIGN LOOPED VIRAMA +113D0 ; Extend # Mn TULU-TIGALARI CONJOINER +113D2 ; Extend # Mn TULU-TIGALARI GEMINATION MARK +113E1..113E2 ; Extend # Mn [2] TULU-TIGALARI VEDIC TONE SVARITA..TULU-TIGALARI VEDIC TONE ANUDATTA +11435..11437 ; Extend # Mc [3] NEWA VOWEL SIGN AA..NEWA VOWEL SIGN II +11438..1143F ; Extend # Mn [8] NEWA VOWEL SIGN U..NEWA VOWEL SIGN AI +11440..11441 ; Extend # Mc [2] NEWA VOWEL SIGN O..NEWA VOWEL SIGN AU +11442..11444 ; Extend # Mn [3] NEWA SIGN VIRAMA..NEWA SIGN ANUSVARA +11445 ; Extend # Mc NEWA SIGN VISARGA +11446 ; Extend # Mn NEWA SIGN NUKTA +1145E ; Extend # Mn NEWA SANDHI MARK +114B0..114B2 ; Extend # Mc [3] TIRHUTA VOWEL SIGN AA..TIRHUTA VOWEL SIGN II +114B3..114B8 ; Extend # Mn [6] TIRHUTA VOWEL SIGN U..TIRHUTA VOWEL SIGN VOCALIC LL +114B9 ; Extend # Mc TIRHUTA VOWEL SIGN E +114BA ; Extend # Mn TIRHUTA VOWEL SIGN SHORT E +114BB..114BE ; Extend # Mc [4] TIRHUTA VOWEL SIGN AI..TIRHUTA VOWEL SIGN AU +114BF..114C0 ; Extend # Mn [2] TIRHUTA SIGN CANDRABINDU..TIRHUTA SIGN ANUSVARA +114C1 ; Extend # Mc TIRHUTA SIGN VISARGA +114C2..114C3 ; Extend # Mn [2] TIRHUTA SIGN VIRAMA..TIRHUTA SIGN NUKTA +115AF..115B1 ; Extend # Mc [3] SIDDHAM VOWEL SIGN AA..SIDDHAM VOWEL SIGN II +115B2..115B5 ; Extend # Mn [4] SIDDHAM VOWEL SIGN U..SIDDHAM VOWEL SIGN VOCALIC RR +115B8..115BB ; Extend # Mc [4] SIDDHAM VOWEL SIGN E..SIDDHAM VOWEL SIGN AU +115BC..115BD ; Extend # Mn [2] SIDDHAM SIGN CANDRABINDU..SIDDHAM SIGN ANUSVARA +115BE ; Extend # Mc SIDDHAM SIGN VISARGA +115BF..115C0 ; Extend # Mn [2] SIDDHAM SIGN VIRAMA..SIDDHAM SIGN NUKTA +115DC..115DD ; Extend # Mn [2] SIDDHAM VOWEL SIGN ALTERNATE U..SIDDHAM VOWEL SIGN ALTERNATE UU +11630..11632 ; Extend # Mc [3] 
MODI VOWEL SIGN AA..MODI VOWEL SIGN II +11633..1163A ; Extend # Mn [8] MODI VOWEL SIGN U..MODI VOWEL SIGN AI +1163B..1163C ; Extend # Mc [2] MODI VOWEL SIGN O..MODI VOWEL SIGN AU +1163D ; Extend # Mn MODI SIGN ANUSVARA +1163E ; Extend # Mc MODI SIGN VISARGA +1163F..11640 ; Extend # Mn [2] MODI SIGN VIRAMA..MODI SIGN ARDHACANDRA +116AB ; Extend # Mn TAKRI SIGN ANUSVARA +116AC ; Extend # Mc TAKRI SIGN VISARGA +116AD ; Extend # Mn TAKRI VOWEL SIGN AA +116AE..116AF ; Extend # Mc [2] TAKRI VOWEL SIGN I..TAKRI VOWEL SIGN II +116B0..116B5 ; Extend # Mn [6] TAKRI VOWEL SIGN U..TAKRI VOWEL SIGN AU +116B6 ; Extend # Mc TAKRI SIGN VIRAMA +116B7 ; Extend # Mn TAKRI SIGN NUKTA +1171D ; Extend # Mn AHOM CONSONANT SIGN MEDIAL LA +1171E ; Extend # Mc AHOM CONSONANT SIGN MEDIAL RA +1171F ; Extend # Mn AHOM CONSONANT SIGN MEDIAL LIGATING RA +11720..11721 ; Extend # Mc [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA +11722..11725 ; Extend # Mn [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU +11726 ; Extend # Mc AHOM VOWEL SIGN E +11727..1172B ; Extend # Mn [5] AHOM VOWEL SIGN AW..AHOM SIGN KILLER +1182C..1182E ; Extend # Mc [3] DOGRA VOWEL SIGN AA..DOGRA VOWEL SIGN II +1182F..11837 ; Extend # Mn [9] DOGRA VOWEL SIGN U..DOGRA SIGN ANUSVARA +11838 ; Extend # Mc DOGRA SIGN VISARGA +11839..1183A ; Extend # Mn [2] DOGRA SIGN VIRAMA..DOGRA SIGN NUKTA +11930..11935 ; Extend # Mc [6] DIVES AKURU VOWEL SIGN AA..DIVES AKURU VOWEL SIGN E +11937..11938 ; Extend # Mc [2] DIVES AKURU VOWEL SIGN AI..DIVES AKURU VOWEL SIGN O +1193B..1193C ; Extend # Mn [2] DIVES AKURU SIGN ANUSVARA..DIVES AKURU SIGN CANDRABINDU +1193D ; Extend # Mc DIVES AKURU SIGN HALANTA +1193E ; Extend # Mn DIVES AKURU VIRAMA +11940 ; Extend # Mc DIVES AKURU MEDIAL YA +11942 ; Extend # Mc DIVES AKURU MEDIAL RA +11943 ; Extend # Mn DIVES AKURU SIGN NUKTA +119D1..119D3 ; Extend # Mc [3] NANDINAGARI VOWEL SIGN AA..NANDINAGARI VOWEL SIGN II +119D4..119D7 ; Extend # Mn [4] NANDINAGARI VOWEL SIGN U..NANDINAGARI VOWEL SIGN VOCALIC RR +119DA..119DB 
; Extend # Mn [2] NANDINAGARI VOWEL SIGN E..NANDINAGARI VOWEL SIGN AI +119DC..119DF ; Extend # Mc [4] NANDINAGARI VOWEL SIGN O..NANDINAGARI SIGN VISARGA +119E0 ; Extend # Mn NANDINAGARI SIGN VIRAMA +119E4 ; Extend # Mc NANDINAGARI VOWEL SIGN PRISHTHAMATRA E +11A01..11A0A ; Extend # Mn [10] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL LENGTH MARK +11A33..11A38 ; Extend # Mn [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA +11A39 ; Extend # Mc ZANABAZAR SQUARE SIGN VISARGA +11A3B..11A3E ; Extend # Mn [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA +11A47 ; Extend # Mn ZANABAZAR SQUARE SUBJOINER +11A51..11A56 ; Extend # Mn [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE +11A57..11A58 ; Extend # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU +11A59..11A5B ; Extend # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK +11A8A..11A96 ; Extend # Mn [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA +11A97 ; Extend # Mc SOYOMBO SIGN VISARGA +11A98..11A99 ; Extend # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER +11B60 ; Extend # Mn SHARADA VOWEL SIGN OE +11B61 ; Extend # Mc SHARADA VOWEL SIGN OOE +11B62..11B64 ; Extend # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E +11B65 ; Extend # Mc SHARADA VOWEL SIGN SHORT O +11B66 ; Extend # Mn SHARADA VOWEL SIGN CANDRA E +11B67 ; Extend # Mc SHARADA VOWEL SIGN CANDRA O +11C2F ; Extend # Mc BHAIKSUKI VOWEL SIGN AA +11C30..11C36 ; Extend # Mn [7] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN VOCALIC L +11C38..11C3D ; Extend # Mn [6] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN ANUSVARA +11C3E ; Extend # Mc BHAIKSUKI SIGN VISARGA +11C3F ; Extend # Mn BHAIKSUKI SIGN VIRAMA +11C92..11CA7 ; Extend # Mn [22] MARCHEN SUBJOINED LETTER KA..MARCHEN SUBJOINED LETTER ZA +11CA9 ; Extend # Mc MARCHEN SUBJOINED LETTER YA +11CAA..11CB0 ; Extend # Mn [7] MARCHEN SUBJOINED LETTER RA..MARCHEN VOWEL SIGN AA +11CB1 ; Extend # Mc MARCHEN VOWEL SIGN I 
+11CB2..11CB3 ; Extend # Mn [2] MARCHEN VOWEL SIGN U..MARCHEN VOWEL SIGN E +11CB4 ; Extend # Mc MARCHEN VOWEL SIGN O +11CB5..11CB6 ; Extend # Mn [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU +11D31..11D36 ; Extend # Mn [6] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN VOCALIC R +11D3A ; Extend # Mn MASARAM GONDI VOWEL SIGN E +11D3C..11D3D ; Extend # Mn [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O +11D3F..11D45 ; Extend # Mn [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA +11D47 ; Extend # Mn MASARAM GONDI RA-KARA +11D8A..11D8E ; Extend # Mc [5] GUNJALA GONDI VOWEL SIGN AA..GUNJALA GONDI VOWEL SIGN UU +11D90..11D91 ; Extend # Mn [2] GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VOWEL SIGN AI +11D93..11D94 ; Extend # Mc [2] GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI VOWEL SIGN AU +11D95 ; Extend # Mn GUNJALA GONDI SIGN ANUSVARA +11D96 ; Extend # Mc GUNJALA GONDI SIGN VISARGA +11D97 ; Extend # Mn GUNJALA GONDI VIRAMA +11EF3..11EF4 ; Extend # Mn [2] MAKASAR VOWEL SIGN I..MAKASAR VOWEL SIGN U +11EF5..11EF6 ; Extend # Mc [2] MAKASAR VOWEL SIGN E..MAKASAR VOWEL SIGN O +11F00..11F01 ; Extend # Mn [2] KAWI SIGN CANDRABINDU..KAWI SIGN ANUSVARA +11F03 ; Extend # Mc KAWI SIGN VISARGA +11F34..11F35 ; Extend # Mc [2] KAWI VOWEL SIGN AA..KAWI VOWEL SIGN ALTERNATE AA +11F36..11F3A ; Extend # Mn [5] KAWI VOWEL SIGN I..KAWI VOWEL SIGN VOCALIC R +11F3E..11F3F ; Extend # Mc [2] KAWI VOWEL SIGN E..KAWI VOWEL SIGN AI +11F40 ; Extend # Mn KAWI VOWEL SIGN EU +11F41 ; Extend # Mc KAWI SIGN KILLER +11F42 ; Extend # Mn KAWI CONJOINER +11F5A ; Extend # Mn KAWI SIGN NUKTA +13440 ; Extend # Mn EGYPTIAN HIEROGLYPH MIRROR HORIZONTALLY +13447..13455 ; Extend # Mn [15] EGYPTIAN HIEROGLYPH MODIFIER DAMAGED AT TOP START..EGYPTIAN HIEROGLYPH MODIFIER DAMAGED +1611E..16129 ; Extend # Mn [12] GURUNG KHEMA VOWEL SIGN AA..GURUNG KHEMA VOWEL LENGTH MARK +1612A..1612C ; Extend # Mc [3] GURUNG KHEMA CONSONANT SIGN MEDIAL YA..GURUNG KHEMA CONSONANT SIGN MEDIAL HA +1612D..1612F 
; Extend # Mn [3] GURUNG KHEMA SIGN ANUSVARA..GURUNG KHEMA SIGN THOLHOMA +16AF0..16AF4 ; Extend # Mn [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE +16B30..16B36 ; Extend # Mn [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM +16F4F ; Extend # Mn MIAO SIGN CONSONANT MODIFIER BAR +16F51..16F87 ; Extend # Mc [55] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN UI +16F8F..16F92 ; Extend # Mn [4] MIAO TONE RIGHT..MIAO TONE BELOW +16FE4 ; Extend # Mn KHITAN SMALL SCRIPT FILLER +16FF0..16FF1 ; Extend # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY +1BC9D..1BC9E ; Extend # Mn [2] DUPLOYAN THICK LETTER SELECTOR..DUPLOYAN DOUBLE MARK +1CF00..1CF2D ; Extend # Mn [46] ZNAMENNY COMBINING MARK GORAZDO NIZKO S KRYZHEM ON LEFT..ZNAMENNY COMBINING MARK KRYZH ON LEFT +1CF30..1CF46 ; Extend # Mn [23] ZNAMENNY COMBINING TONAL RANGE MARK MRACHNO..ZNAMENNY PRIZNAK MODIFIER ROG +1D165..1D166 ; Extend # Mc [2] MUSICAL SYMBOL COMBINING STEM..MUSICAL SYMBOL COMBINING SPRECHGESANG STEM +1D167..1D169 ; Extend # Mn [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3 +1D16D..1D172 ; Extend # Mc [6] MUSICAL SYMBOL COMBINING AUGMENTATION DOT..MUSICAL SYMBOL COMBINING FLAG-5 +1D17B..1D182 ; Extend # Mn [8] MUSICAL SYMBOL COMBINING ACCENT..MUSICAL SYMBOL COMBINING LOURE +1D185..1D18B ; Extend # Mn [7] MUSICAL SYMBOL COMBINING DOIT..MUSICAL SYMBOL COMBINING TRIPLE TONGUE +1D1AA..1D1AD ; Extend # Mn [4] MUSICAL SYMBOL COMBINING DOWN BOW..MUSICAL SYMBOL COMBINING SNAP PIZZICATO +1D242..1D244 ; Extend # Mn [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME +1DA00..1DA36 ; Extend # Mn [55] SIGNWRITING HEAD RIM..SIGNWRITING AIR SUCKING IN +1DA3B..1DA6C ; Extend # Mn [50] SIGNWRITING MOUTH CLOSED NEUTRAL..SIGNWRITING EXCITEMENT +1DA75 ; Extend # Mn SIGNWRITING UPPER BODY TILTING FROM HIP JOINTS +1DA84 ; Extend # Mn SIGNWRITING LOCATION HEAD NECK +1DA9B..1DA9F ; Extend # Mn [5] SIGNWRITING FILL 
MODIFIER-2..SIGNWRITING FILL MODIFIER-6 +1DAA1..1DAAF ; Extend # Mn [15] SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING ROTATION MODIFIER-16 +1E000..1E006 ; Extend # Mn [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE +1E008..1E018 ; Extend # Mn [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU +1E01B..1E021 ; Extend # Mn [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI +1E023..1E024 ; Extend # Mn [2] COMBINING GLAGOLITIC LETTER YU..COMBINING GLAGOLITIC LETTER SMALL YUS +1E026..1E02A ; Extend # Mn [5] COMBINING GLAGOLITIC LETTER YO..COMBINING GLAGOLITIC LETTER FITA +1E08F ; Extend # Mn COMBINING CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I +1E130..1E136 ; Extend # Mn [7] NYIAKENG PUACHUE HMONG TONE-B..NYIAKENG PUACHUE HMONG TONE-D +1E2AE ; Extend # Mn TOTO SIGN RISING TONE +1E2EC..1E2EF ; Extend # Mn [4] WANCHO TONE TUP..WANCHO TONE KOINI +1E4EC..1E4EF ; Extend # Mn [4] NAG MUNDARI SIGN MUHOR..NAG MUNDARI SIGN SUTUH +1E5EE..1E5EF ; Extend # Mn [2] OL ONAL SIGN MU..OL ONAL SIGN IKIR +1E6E3 ; Extend # Mn TAI YO SIGN UE +1E6E6 ; Extend # Mn TAI YO SIGN AU +1E6EE..1E6EF ; Extend # Mn [2] TAI YO SIGN AY..TAI YO SIGN ANG +1E6F5 ; Extend # Mn TAI YO SIGN OM +1E8D0..1E8D6 ; Extend # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS +1E944..1E94A ; Extend # Mn [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA +1F3FB..1F3FF ; Extend # Sk [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6 +E0020..E007F ; Extend # Cf [96] TAG SPACE..CANCEL TAG +E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256 + +# Total code points: 2647 + +# ================================================ + +1F1E6..1F1FF ; Regional_Indicator # So [26] REGIONAL INDICATOR SYMBOL LETTER A..REGIONAL INDICATOR SYMBOL LETTER Z + +# Total code points: 26 + +# ================================================ + +00AD ; Format # Cf SOFT HYPHEN +061C ; Format # Cf ARABIC 
LETTER MARK +180E ; Format # Cf MONGOLIAN VOWEL SEPARATOR +200E..200F ; Format # Cf [2] LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK +202A..202E ; Format # Cf [5] LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE +2060..2064 ; Format # Cf [5] WORD JOINER..INVISIBLE PLUS +2066..206F ; Format # Cf [10] LEFT-TO-RIGHT ISOLATE..NOMINAL DIGIT SHAPES +FEFF ; Format # Cf ZERO WIDTH NO-BREAK SPACE +FFF9..FFFB ; Format # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR +13430..1343F ; Format # Cf [16] EGYPTIAN HIEROGLYPH VERTICAL JOINER..EGYPTIAN HIEROGLYPH END WALLED ENCLOSURE +1BCA0..1BCA3 ; Format # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP +1D173..1D17A ; Format # Cf [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE +E0001 ; Format # Cf LANGUAGE TAG + +# Total code points: 58 + +# ================================================ + +3031..3035 ; Katakana # Lm [5] VERTICAL KANA REPEAT MARK..VERTICAL KANA REPEAT MARK LOWER HALF +309B..309C ; Katakana # Sk [2] KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK +30A0 ; Katakana # Pd KATAKANA-HIRAGANA DOUBLE HYPHEN +30A1..30FA ; Katakana # Lo [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO +30FC..30FE ; Katakana # Lm [3] KATAKANA-HIRAGANA PROLONGED SOUND MARK..KATAKANA VOICED ITERATION MARK +30FF ; Katakana # Lo KATAKANA DIGRAPH KOTO +31F0..31FF ; Katakana # Lo [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO +32D0..32FE ; Katakana # So [47] CIRCLED KATAKANA A..CIRCLED KATAKANA WO +3300..3357 ; Katakana # So [88] SQUARE APAATO..SQUARE WATTO +FF66..FF6F ; Katakana # Lo [10] HALFWIDTH KATAKANA LETTER WO..HALFWIDTH KATAKANA LETTER SMALL TU +FF70 ; Katakana # Lm HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK +FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAKANA LETTER N +1AFF0..1AFF3 ; Katakana # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5 +1AFF5..1AFFB ; Katakana # Lm [7] KATAKANA LETTER MINNAN 
TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5 +1AFFD..1AFFE ; Katakana # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8 +1B000 ; Katakana # Lo KATAKANA LETTER ARCHAIC E +1B120..1B122 ; Katakana # Lo [3] KATAKANA LETTER ARCHAIC YI..KATAKANA LETTER ARCHAIC WU +1B155 ; Katakana # Lo KATAKANA LETTER SMALL KO +1B164..1B167 ; Katakana # Lo [4] KATAKANA LETTER SMALL WI..KATAKANA LETTER SMALL N + +# Total code points: 331 + +# ================================================ + +0041..005A ; ALetter # L& [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z +0061..007A ; ALetter # L& [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z +00AA ; ALetter # Lo FEMININE ORDINAL INDICATOR +00B5 ; ALetter # L& MICRO SIGN +00B8 ; ALetter # Sk CEDILLA +00BA ; ALetter # Lo MASCULINE ORDINAL INDICATOR +00C0..00D6 ; ALetter # L& [23] LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITAL LETTER O WITH DIAERESIS +00D8..00F6 ; ALetter # L& [31] LATIN CAPITAL LETTER O WITH STROKE..LATIN SMALL LETTER O WITH DIAERESIS +00F8..01BA ; ALetter # L& [195] LATIN SMALL LETTER O WITH STROKE..LATIN SMALL LETTER EZH WITH TAIL +01BB ; ALetter # Lo LATIN LETTER TWO WITH STROKE +01BC..01BF ; ALetter # L& [4] LATIN CAPITAL LETTER TONE FIVE..LATIN LETTER WYNN +01C0..01C3 ; ALetter # Lo [4] LATIN LETTER DENTAL CLICK..LATIN LETTER RETROFLEX CLICK +01C4..0293 ; ALetter # L& [208] LATIN CAPITAL LETTER DZ WITH CARON..LATIN SMALL LETTER EZH WITH CURL +0294..0295 ; ALetter # Lo [2] LATIN LETTER GLOTTAL STOP..LATIN LETTER PHARYNGEAL VOICED FRICATIVE +0296..02AF ; ALetter # L& [26] LATIN LETTER INVERTED GLOTTAL STOP..LATIN SMALL LETTER TURNED H WITH FISHHOOK AND TAIL +02B0..02C1 ; ALetter # Lm [18] MODIFIER LETTER SMALL H..MODIFIER LETTER REVERSED GLOTTAL STOP +02C2..02C5 ; ALetter # Sk [4] MODIFIER LETTER LEFT ARROWHEAD..MODIFIER LETTER DOWN ARROWHEAD +02C6..02D1 ; ALetter # Lm [12] MODIFIER LETTER CIRCUMFLEX ACCENT..MODIFIER LETTER HALF TRIANGULAR COLON +02D2..02D7 ; ALetter # Sk 
[6] MODIFIER LETTER CENTRED RIGHT HALF RING..MODIFIER LETTER MINUS SIGN +02DE..02DF ; ALetter # Sk [2] MODIFIER LETTER RHOTIC HOOK..MODIFIER LETTER CROSS ACCENT +02E0..02E4 ; ALetter # Lm [5] MODIFIER LETTER SMALL GAMMA..MODIFIER LETTER SMALL REVERSED GLOTTAL STOP +02E5..02EB ; ALetter # Sk [7] MODIFIER LETTER EXTRA-HIGH TONE BAR..MODIFIER LETTER YANG DEPARTING TONE MARK +02EC ; ALetter # Lm MODIFIER LETTER VOICING +02ED ; ALetter # Sk MODIFIER LETTER UNASPIRATED +02EE ; ALetter # Lm MODIFIER LETTER DOUBLE APOSTROPHE +02EF..02FF ; ALetter # Sk [17] MODIFIER LETTER LOW DOWN ARROWHEAD..MODIFIER LETTER LOW LEFT ARROW +0370..0373 ; ALetter # L& [4] GREEK CAPITAL LETTER HETA..GREEK SMALL LETTER ARCHAIC SAMPI +0374 ; ALetter # Lm GREEK NUMERAL SIGN +0376..0377 ; ALetter # L& [2] GREEK CAPITAL LETTER PAMPHYLIAN DIGAMMA..GREEK SMALL LETTER PAMPHYLIAN DIGAMMA +037A ; ALetter # Lm GREEK YPOGEGRAMMENI +037B..037D ; ALetter # L& [3] GREEK SMALL REVERSED LUNATE SIGMA SYMBOL..GREEK SMALL REVERSED DOTTED LUNATE SIGMA SYMBOL +037F ; ALetter # L& GREEK CAPITAL LETTER YOT +0386 ; ALetter # L& GREEK CAPITAL LETTER ALPHA WITH TONOS +0388..038A ; ALetter # L& [3] GREEK CAPITAL LETTER EPSILON WITH TONOS..GREEK CAPITAL LETTER IOTA WITH TONOS +038C ; ALetter # L& GREEK CAPITAL LETTER OMICRON WITH TONOS +038E..03A1 ; ALetter # L& [20] GREEK CAPITAL LETTER UPSILON WITH TONOS..GREEK CAPITAL LETTER RHO +03A3..03F5 ; ALetter # L& [83] GREEK CAPITAL LETTER SIGMA..GREEK LUNATE EPSILON SYMBOL +03F7..0481 ; ALetter # L& [139] GREEK CAPITAL LETTER SHO..CYRILLIC SMALL LETTER KOPPA +048A..052F ; ALetter # L& [166] CYRILLIC CAPITAL LETTER SHORT I WITH TAIL..CYRILLIC SMALL LETTER EL WITH DESCENDER +0531..0556 ; ALetter # L& [38] ARMENIAN CAPITAL LETTER AYB..ARMENIAN CAPITAL LETTER FEH +0559 ; ALetter # Lm ARMENIAN MODIFIER LETTER LEFT HALF RING +055A..055C ; ALetter # Po [3] ARMENIAN APOSTROPHE..ARMENIAN EXCLAMATION MARK +055E ; ALetter # Po ARMENIAN QUESTION MARK +0560..0588 ; ALetter # L& [41] 
ARMENIAN SMALL LETTER TURNED AYB..ARMENIAN SMALL LETTER YI WITH STROKE +058A ; ALetter # Pd ARMENIAN HYPHEN +05F3 ; ALetter # Po HEBREW PUNCTUATION GERESH +0620..063F ; ALetter # Lo [32] ARABIC LETTER KASHMIRI YEH..ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE +0640 ; ALetter # Lm ARABIC TATWEEL +0641..064A ; ALetter # Lo [10] ARABIC LETTER FEH..ARABIC LETTER YEH +066E..066F ; ALetter # Lo [2] ARABIC LETTER DOTLESS BEH..ARABIC LETTER DOTLESS QAF +0671..06D3 ; ALetter # Lo [99] ARABIC LETTER ALEF WASLA..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE +06D5 ; ALetter # Lo ARABIC LETTER AE +06E5..06E6 ; ALetter # Lm [2] ARABIC SMALL WAW..ARABIC SMALL YEH +06EE..06EF ; ALetter # Lo [2] ARABIC LETTER DAL WITH INVERTED V..ARABIC LETTER REH WITH INVERTED V +06FA..06FC ; ALetter # Lo [3] ARABIC LETTER SHEEN WITH DOT BELOW..ARABIC LETTER GHAIN WITH DOT BELOW +06FF ; ALetter # Lo ARABIC LETTER HEH WITH INVERTED V +070F ; ALetter # Cf SYRIAC ABBREVIATION MARK +0710 ; ALetter # Lo SYRIAC LETTER ALAPH +0712..072F ; ALetter # Lo [30] SYRIAC LETTER BETH..SYRIAC LETTER PERSIAN DHALATH +074D..07A5 ; ALetter # Lo [89] SYRIAC LETTER SOGDIAN ZHAIN..THAANA LETTER WAAVU +07B1 ; ALetter # Lo THAANA LETTER NAA +07CA..07EA ; ALetter # Lo [33] NKO LETTER A..NKO LETTER JONA RA +07F4..07F5 ; ALetter # Lm [2] NKO HIGH TONE APOSTROPHE..NKO LOW TONE APOSTROPHE +07FA ; ALetter # Lm NKO LAJANYALAN +0800..0815 ; ALetter # Lo [22] SAMARITAN LETTER ALAF..SAMARITAN LETTER TAAF +081A ; ALetter # Lm SAMARITAN MODIFIER LETTER EPENTHETIC YUT +0824 ; ALetter # Lm SAMARITAN MODIFIER LETTER SHORT A +0828 ; ALetter # Lm SAMARITAN MODIFIER LETTER I +0840..0858 ; ALetter # Lo [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN +0860..086A ; ALetter # Lo [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA +0870..0887 ; ALetter # Lo [24] ARABIC LETTER ALEF WITH ATTACHED FATHA..ARABIC BASELINE ROUND DOT +0889..088F ; ALetter # Lo [7] ARABIC LETTER NOON WITH INVERTED SMALL V..ARABIC LETTER NOON WITH RING ABOVE 
+08A0..08C8 ; ALetter # Lo [41] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER GRAF +08C9 ; ALetter # Lm ARABIC SMALL FARSI YEH +0904..0939 ; ALetter # Lo [54] DEVANAGARI LETTER SHORT A..DEVANAGARI LETTER HA +093D ; ALetter # Lo DEVANAGARI SIGN AVAGRAHA +0950 ; ALetter # Lo DEVANAGARI OM +0958..0961 ; ALetter # Lo [10] DEVANAGARI LETTER QA..DEVANAGARI LETTER VOCALIC LL +0971 ; ALetter # Lm DEVANAGARI SIGN HIGH SPACING DOT +0972..0980 ; ALetter # Lo [15] DEVANAGARI LETTER CANDRA A..BENGALI ANJI +0985..098C ; ALetter # Lo [8] BENGALI LETTER A..BENGALI LETTER VOCALIC L +098F..0990 ; ALetter # Lo [2] BENGALI LETTER E..BENGALI LETTER AI +0993..09A8 ; ALetter # Lo [22] BENGALI LETTER O..BENGALI LETTER NA +09AA..09B0 ; ALetter # Lo [7] BENGALI LETTER PA..BENGALI LETTER RA +09B2 ; ALetter # Lo BENGALI LETTER LA +09B6..09B9 ; ALetter # Lo [4] BENGALI LETTER SHA..BENGALI LETTER HA +09BD ; ALetter # Lo BENGALI SIGN AVAGRAHA +09CE ; ALetter # Lo BENGALI LETTER KHANDA TA +09DC..09DD ; ALetter # Lo [2] BENGALI LETTER RRA..BENGALI LETTER RHA +09DF..09E1 ; ALetter # Lo [3] BENGALI LETTER YYA..BENGALI LETTER VOCALIC LL +09F0..09F1 ; ALetter # Lo [2] BENGALI LETTER RA WITH MIDDLE DIAGONAL..BENGALI LETTER RA WITH LOWER DIAGONAL +09FC ; ALetter # Lo BENGALI LETTER VEDIC ANUSVARA +0A05..0A0A ; ALetter # Lo [6] GURMUKHI LETTER A..GURMUKHI LETTER UU +0A0F..0A10 ; ALetter # Lo [2] GURMUKHI LETTER EE..GURMUKHI LETTER AI +0A13..0A28 ; ALetter # Lo [22] GURMUKHI LETTER OO..GURMUKHI LETTER NA +0A2A..0A30 ; ALetter # Lo [7] GURMUKHI LETTER PA..GURMUKHI LETTER RA +0A32..0A33 ; ALetter # Lo [2] GURMUKHI LETTER LA..GURMUKHI LETTER LLA +0A35..0A36 ; ALetter # Lo [2] GURMUKHI LETTER VA..GURMUKHI LETTER SHA +0A38..0A39 ; ALetter # Lo [2] GURMUKHI LETTER SA..GURMUKHI LETTER HA +0A59..0A5C ; ALetter # Lo [4] GURMUKHI LETTER KHHA..GURMUKHI LETTER RRA +0A5E ; ALetter # Lo GURMUKHI LETTER FA +0A72..0A74 ; ALetter # Lo [3] GURMUKHI IRI..GURMUKHI EK ONKAR +0A85..0A8D ; ALetter # Lo [9] GUJARATI LETTER 
A..GUJARATI VOWEL CANDRA E +0A8F..0A91 ; ALetter # Lo [3] GUJARATI LETTER E..GUJARATI VOWEL CANDRA O +0A93..0AA8 ; ALetter # Lo [22] GUJARATI LETTER O..GUJARATI LETTER NA +0AAA..0AB0 ; ALetter # Lo [7] GUJARATI LETTER PA..GUJARATI LETTER RA +0AB2..0AB3 ; ALetter # Lo [2] GUJARATI LETTER LA..GUJARATI LETTER LLA +0AB5..0AB9 ; ALetter # Lo [5] GUJARATI LETTER VA..GUJARATI LETTER HA +0ABD ; ALetter # Lo GUJARATI SIGN AVAGRAHA +0AD0 ; ALetter # Lo GUJARATI OM +0AE0..0AE1 ; ALetter # Lo [2] GUJARATI LETTER VOCALIC RR..GUJARATI LETTER VOCALIC LL +0AF9 ; ALetter # Lo GUJARATI LETTER ZHA +0B05..0B0C ; ALetter # Lo [8] ORIYA LETTER A..ORIYA LETTER VOCALIC L +0B0F..0B10 ; ALetter # Lo [2] ORIYA LETTER E..ORIYA LETTER AI +0B13..0B28 ; ALetter # Lo [22] ORIYA LETTER O..ORIYA LETTER NA +0B2A..0B30 ; ALetter # Lo [7] ORIYA LETTER PA..ORIYA LETTER RA +0B32..0B33 ; ALetter # Lo [2] ORIYA LETTER LA..ORIYA LETTER LLA +0B35..0B39 ; ALetter # Lo [5] ORIYA LETTER VA..ORIYA LETTER HA +0B3D ; ALetter # Lo ORIYA SIGN AVAGRAHA +0B5C..0B5D ; ALetter # Lo [2] ORIYA LETTER RRA..ORIYA LETTER RHA +0B5F..0B61 ; ALetter # Lo [3] ORIYA LETTER YYA..ORIYA LETTER VOCALIC LL +0B71 ; ALetter # Lo ORIYA LETTER WA +0B83 ; ALetter # Lo TAMIL SIGN VISARGA +0B85..0B8A ; ALetter # Lo [6] TAMIL LETTER A..TAMIL LETTER UU +0B8E..0B90 ; ALetter # Lo [3] TAMIL LETTER E..TAMIL LETTER AI +0B92..0B95 ; ALetter # Lo [4] TAMIL LETTER O..TAMIL LETTER KA +0B99..0B9A ; ALetter # Lo [2] TAMIL LETTER NGA..TAMIL LETTER CA +0B9C ; ALetter # Lo TAMIL LETTER JA +0B9E..0B9F ; ALetter # Lo [2] TAMIL LETTER NYA..TAMIL LETTER TTA +0BA3..0BA4 ; ALetter # Lo [2] TAMIL LETTER NNA..TAMIL LETTER TA +0BA8..0BAA ; ALetter # Lo [3] TAMIL LETTER NA..TAMIL LETTER PA +0BAE..0BB9 ; ALetter # Lo [12] TAMIL LETTER MA..TAMIL LETTER HA +0BD0 ; ALetter # Lo TAMIL OM +0C05..0C0C ; ALetter # Lo [8] TELUGU LETTER A..TELUGU LETTER VOCALIC L +0C0E..0C10 ; ALetter # Lo [3] TELUGU LETTER E..TELUGU LETTER AI +0C12..0C28 ; ALetter # Lo [23] TELUGU LETTER 
O..TELUGU LETTER NA +0C2A..0C39 ; ALetter # Lo [16] TELUGU LETTER PA..TELUGU LETTER HA +0C3D ; ALetter # Lo TELUGU SIGN AVAGRAHA +0C58..0C5A ; ALetter # Lo [3] TELUGU LETTER TSA..TELUGU LETTER RRRA +0C5C..0C5D ; ALetter # Lo [2] TELUGU ARCHAIC SHRII..TELUGU LETTER NAKAARA POLLU +0C60..0C61 ; ALetter # Lo [2] TELUGU LETTER VOCALIC RR..TELUGU LETTER VOCALIC LL +0C80 ; ALetter # Lo KANNADA SIGN SPACING CANDRABINDU +0C85..0C8C ; ALetter # Lo [8] KANNADA LETTER A..KANNADA LETTER VOCALIC L +0C8E..0C90 ; ALetter # Lo [3] KANNADA LETTER E..KANNADA LETTER AI +0C92..0CA8 ; ALetter # Lo [23] KANNADA LETTER O..KANNADA LETTER NA +0CAA..0CB3 ; ALetter # Lo [10] KANNADA LETTER PA..KANNADA LETTER LLA +0CB5..0CB9 ; ALetter # Lo [5] KANNADA LETTER VA..KANNADA LETTER HA +0CBD ; ALetter # Lo KANNADA SIGN AVAGRAHA +0CDC..0CDE ; ALetter # Lo [3] KANNADA ARCHAIC SHRII..KANNADA LETTER FA +0CE0..0CE1 ; ALetter # Lo [2] KANNADA LETTER VOCALIC RR..KANNADA LETTER VOCALIC LL +0CF1..0CF2 ; ALetter # Lo [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA +0D04..0D0C ; ALetter # Lo [9] MALAYALAM LETTER VEDIC ANUSVARA..MALAYALAM LETTER VOCALIC L +0D0E..0D10 ; ALetter # Lo [3] MALAYALAM LETTER E..MALAYALAM LETTER AI +0D12..0D3A ; ALetter # Lo [41] MALAYALAM LETTER O..MALAYALAM LETTER TTTA +0D3D ; ALetter # Lo MALAYALAM SIGN AVAGRAHA +0D4E ; ALetter # Lo MALAYALAM LETTER DOT REPH +0D54..0D56 ; ALetter # Lo [3] MALAYALAM LETTER CHILLU M..MALAYALAM LETTER CHILLU LLL +0D5F..0D61 ; ALetter # Lo [3] MALAYALAM LETTER ARCHAIC II..MALAYALAM LETTER VOCALIC LL +0D7A..0D7F ; ALetter # Lo [6] MALAYALAM LETTER CHILLU NN..MALAYALAM LETTER CHILLU K +0D85..0D96 ; ALetter # Lo [18] SINHALA LETTER AYANNA..SINHALA LETTER AUYANNA +0D9A..0DB1 ; ALetter # Lo [24] SINHALA LETTER ALPAPRAANA KAYANNA..SINHALA LETTER DANTAJA NAYANNA +0DB3..0DBB ; ALetter # Lo [9] SINHALA LETTER SANYAKA DAYANNA..SINHALA LETTER RAYANNA +0DBD ; ALetter # Lo SINHALA LETTER DANTAJA LAYANNA +0DC0..0DC6 ; ALetter # Lo [7] SINHALA LETTER 
VAYANNA..SINHALA LETTER FAYANNA +0F00 ; ALetter # Lo TIBETAN SYLLABLE OM +0F40..0F47 ; ALetter # Lo [8] TIBETAN LETTER KA..TIBETAN LETTER JA +0F49..0F6C ; ALetter # Lo [36] TIBETAN LETTER NYA..TIBETAN LETTER RRA +0F88..0F8C ; ALetter # Lo [5] TIBETAN SIGN LCE TSA CAN..TIBETAN SIGN INVERTED MCHU CAN +10A0..10C5 ; ALetter # L& [38] GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LETTER HOE +10C7 ; ALetter # L& GEORGIAN CAPITAL LETTER YN +10CD ; ALetter # L& GEORGIAN CAPITAL LETTER AEN +10D0..10FA ; ALetter # L& [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN +10FC ; ALetter # Lm MODIFIER LETTER GEORGIAN NAR +10FD..10FF ; ALetter # L& [3] GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL SIGN +1100..1248 ; ALetter # Lo [329] HANGUL CHOSEONG KIYEOK..ETHIOPIC SYLLABLE QWA +124A..124D ; ALetter # Lo [4] ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE +1250..1256 ; ALetter # Lo [7] ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO +1258 ; ALetter # Lo ETHIOPIC SYLLABLE QHWA +125A..125D ; ALetter # Lo [4] ETHIOPIC SYLLABLE QHWI..ETHIOPIC SYLLABLE QHWE +1260..1288 ; ALetter # Lo [41] ETHIOPIC SYLLABLE BA..ETHIOPIC SYLLABLE XWA +128A..128D ; ALetter # Lo [4] ETHIOPIC SYLLABLE XWI..ETHIOPIC SYLLABLE XWE +1290..12B0 ; ALetter # Lo [33] ETHIOPIC SYLLABLE NA..ETHIOPIC SYLLABLE KWA +12B2..12B5 ; ALetter # Lo [4] ETHIOPIC SYLLABLE KWI..ETHIOPIC SYLLABLE KWE +12B8..12BE ; ALetter # Lo [7] ETHIOPIC SYLLABLE KXA..ETHIOPIC SYLLABLE KXO +12C0 ; ALetter # Lo ETHIOPIC SYLLABLE KXWA +12C2..12C5 ; ALetter # Lo [4] ETHIOPIC SYLLABLE KXWI..ETHIOPIC SYLLABLE KXWE +12C8..12D6 ; ALetter # Lo [15] ETHIOPIC SYLLABLE WA..ETHIOPIC SYLLABLE PHARYNGEAL O +12D8..1310 ; ALetter # Lo [57] ETHIOPIC SYLLABLE ZA..ETHIOPIC SYLLABLE GWA +1312..1315 ; ALetter # Lo [4] ETHIOPIC SYLLABLE GWI..ETHIOPIC SYLLABLE GWE +1318..135A ; ALetter # Lo [67] ETHIOPIC SYLLABLE GGA..ETHIOPIC SYLLABLE FYA +1380..138F ; ALetter # Lo [16] ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLABLE PWE +13A0..13F5 ; ALetter # L& [86] CHEROKEE LETTER 
A..CHEROKEE LETTER MV +13F8..13FD ; ALetter # L& [6] CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETTER MV +1401..166C ; ALetter # Lo [620] CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIER TTSA +166F..167F ; ALetter # Lo [17] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS BLACKFOOT W +1681..169A ; ALetter # Lo [26] OGHAM LETTER BEITH..OGHAM LETTER PEITH +16A0..16EA ; ALetter # Lo [75] RUNIC LETTER FEHU FEOH FE F..RUNIC LETTER X +16EE..16F0 ; ALetter # Nl [3] RUNIC ARLAUG SYMBOL..RUNIC BELGTHOR SYMBOL +16F1..16F8 ; ALetter # Lo [8] RUNIC LETTER K..RUNIC LETTER FRANKS CASKET AESC +1700..1711 ; ALetter # Lo [18] TAGALOG LETTER A..TAGALOG LETTER HA +171F..1731 ; ALetter # Lo [19] TAGALOG LETTER ARCHAIC RA..HANUNOO LETTER HA +1740..1751 ; ALetter # Lo [18] BUHID LETTER A..BUHID LETTER HA +1760..176C ; ALetter # Lo [13] TAGBANWA LETTER A..TAGBANWA LETTER YA +176E..1770 ; ALetter # Lo [3] TAGBANWA LETTER LA..TAGBANWA LETTER SA +1820..1842 ; ALetter # Lo [35] MONGOLIAN LETTER A..MONGOLIAN LETTER CHI +1843 ; ALetter # Lm MONGOLIAN LETTER TODO LONG VOWEL SIGN +1844..1878 ; ALetter # Lo [53] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER CHA WITH TWO DOTS +1880..1884 ; ALetter # Lo [5] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER ALI GALI INVERTED UBADAMA +1887..18A8 ; ALetter # Lo [34] MONGOLIAN LETTER ALI GALI A..MONGOLIAN LETTER MANCHU ALI GALI BHA +18AA ; ALetter # Lo MONGOLIAN LETTER MANCHU ALI GALI LHA +18B0..18F5 ; ALetter # Lo [70] CANADIAN SYLLABICS OY..CANADIAN SYLLABICS CARRIER DENTAL S +1900..191E ; ALetter # Lo [31] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER TRA +1A00..1A16 ; ALetter # Lo [23] BUGINESE LETTER KA..BUGINESE LETTER HA +1B05..1B33 ; ALetter # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA +1B45..1B4C ; ALetter # Lo [8] BALINESE LETTER KAF SASAK..BALINESE LETTER ARCHAIC JNYA +1B83..1BA0 ; ALetter # Lo [30] SUNDANESE LETTER A..SUNDANESE LETTER HA +1BAE..1BAF ; ALetter # Lo [2] SUNDANESE LETTER KHA..SUNDANESE LETTER SYA +1BBA..1BE5 ; ALetter # Lo [44] 
SUNDANESE AVAGRAHA..BATAK LETTER U +1C00..1C23 ; ALetter # Lo [36] LEPCHA LETTER KA..LEPCHA LETTER A +1C4D..1C4F ; ALetter # Lo [3] LEPCHA LETTER TTA..LEPCHA LETTER DDA +1C5A..1C77 ; ALetter # Lo [30] OL CHIKI LETTER LA..OL CHIKI LETTER OH +1C78..1C7D ; ALetter # Lm [6] OL CHIKI MU TTUDDAG..OL CHIKI AHAD +1C80..1C8A ; ALetter # L& [11] CYRILLIC SMALL LETTER ROUNDED VE..CYRILLIC SMALL LETTER TJE +1C90..1CBA ; ALetter # L& [43] GEORGIAN MTAVRULI CAPITAL LETTER AN..GEORGIAN MTAVRULI CAPITAL LETTER AIN +1CBD..1CBF ; ALetter # L& [3] GEORGIAN MTAVRULI CAPITAL LETTER AEN..GEORGIAN MTAVRULI CAPITAL LETTER LABIAL SIGN +1CE9..1CEC ; ALetter # Lo [4] VEDIC SIGN ANUSVARA ANTARGOMUKHA..VEDIC SIGN ANUSVARA VAMAGOMUKHA WITH TAIL +1CEE..1CF3 ; ALetter # Lo [6] VEDIC SIGN HEXIFORM LONG ANUSVARA..VEDIC SIGN ROTATED ARDHAVISARGA +1CF5..1CF6 ; ALetter # Lo [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA +1CFA ; ALetter # Lo VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA +1D00..1D2B ; ALetter # L& [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL +1D2C..1D6A ; ALetter # Lm [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI +1D6B..1D77 ; ALetter # L& [13] LATIN SMALL LETTER UE..LATIN SMALL LETTER TURNED G +1D78 ; ALetter # Lm MODIFIER LETTER CYRILLIC EN +1D79..1D9A ; ALetter # L& [34] LATIN SMALL LETTER INSULAR G..LATIN SMALL LETTER EZH WITH RETROFLEX HOOK +1D9B..1DBF ; ALetter # Lm [37] MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LETTER SMALL THETA +1E00..1F15 ; ALetter # L& [278] LATIN CAPITAL LETTER A WITH RING BELOW..GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA +1F18..1F1D ; ALetter # L& [6] GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA +1F20..1F45 ; ALetter # L& [38] GREEK SMALL LETTER ETA WITH PSILI..GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA +1F48..1F4D ; ALetter # L& [6] GREEK CAPITAL LETTER OMICRON WITH PSILI..GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA +1F50..1F57 ; ALetter # L& [8] GREEK 
SMALL LETTER UPSILON WITH PSILI..GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI +1F59 ; ALetter # L& GREEK CAPITAL LETTER UPSILON WITH DASIA +1F5B ; ALetter # L& GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA +1F5D ; ALetter # L& GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA +1F5F..1F7D ; ALetter # L& [31] GREEK CAPITAL LETTER UPSILON WITH DASIA AND PERISPOMENI..GREEK SMALL LETTER OMEGA WITH OXIA +1F80..1FB4 ; ALetter # L& [53] GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI..GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI +1FB6..1FBC ; ALetter # L& [7] GREEK SMALL LETTER ALPHA WITH PERISPOMENI..GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI +1FBE ; ALetter # L& GREEK PROSGEGRAMMENI +1FC2..1FC4 ; ALetter # L& [3] GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI +1FC6..1FCC ; ALetter # L& [7] GREEK SMALL LETTER ETA WITH PERISPOMENI..GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI +1FD0..1FD3 ; ALetter # L& [4] GREEK SMALL LETTER IOTA WITH VRACHY..GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA +1FD6..1FDB ; ALetter # L& [6] GREEK SMALL LETTER IOTA WITH PERISPOMENI..GREEK CAPITAL LETTER IOTA WITH OXIA +1FE0..1FEC ; ALetter # L& [13] GREEK SMALL LETTER UPSILON WITH VRACHY..GREEK CAPITAL LETTER RHO WITH DASIA +1FF2..1FF4 ; ALetter # L& [3] GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI +1FF6..1FFC ; ALetter # L& [7] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI +2071 ; ALetter # Lm SUPERSCRIPT LATIN SMALL LETTER I +207F ; ALetter # Lm SUPERSCRIPT LATIN SMALL LETTER N +2090..209C ; ALetter # Lm [13] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER T +2102 ; ALetter # L& DOUBLE-STRUCK CAPITAL C +2107 ; ALetter # L& EULER CONSTANT +210A..2113 ; ALetter # L& [10] SCRIPT SMALL G..SCRIPT SMALL L +2115 ; ALetter # L& DOUBLE-STRUCK CAPITAL N +2119..211D ; ALetter # L& [5] DOUBLE-STRUCK 
CAPITAL P..DOUBLE-STRUCK CAPITAL R +2124 ; ALetter # L& DOUBLE-STRUCK CAPITAL Z +2126 ; ALetter # L& OHM SIGN +2128 ; ALetter # L& BLACK-LETTER CAPITAL Z +212A..212D ; ALetter # L& [4] KELVIN SIGN..BLACK-LETTER CAPITAL C +212F..2134 ; ALetter # L& [6] SCRIPT SMALL E..SCRIPT SMALL O +2135..2138 ; ALetter # Lo [4] ALEF SYMBOL..DALET SYMBOL +2139 ; ALetter # L& INFORMATION SOURCE +213C..213F ; ALetter # L& [4] DOUBLE-STRUCK SMALL PI..DOUBLE-STRUCK CAPITAL PI +2145..2149 ; ALetter # L& [5] DOUBLE-STRUCK ITALIC CAPITAL D..DOUBLE-STRUCK ITALIC SMALL J +214E ; ALetter # L& TURNED SMALL F +2160..2182 ; ALetter # Nl [35] ROMAN NUMERAL ONE..ROMAN NUMERAL TEN THOUSAND +2183..2184 ; ALetter # L& [2] ROMAN NUMERAL REVERSED ONE HUNDRED..LATIN SMALL LETTER REVERSED C +2185..2188 ; ALetter # Nl [4] ROMAN NUMERAL SIX LATE FORM..ROMAN NUMERAL ONE HUNDRED THOUSAND +24B6..24E9 ; ALetter # So [52] CIRCLED LATIN CAPITAL LETTER A..CIRCLED LATIN SMALL LETTER Z +2C00..2C7B ; ALetter # L& [124] GLAGOLITIC CAPITAL LETTER AZU..LATIN LETTER SMALL CAPITAL TURNED E +2C7C..2C7D ; ALetter # Lm [2] LATIN SUBSCRIPT SMALL LETTER J..MODIFIER LETTER CAPITAL V +2C7E..2CE4 ; ALetter # L& [103] LATIN CAPITAL LETTER S WITH SWASH TAIL..COPTIC SYMBOL KAI +2CEB..2CEE ; ALetter # L& [4] COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI..COPTIC SMALL LETTER CRYPTOGRAMMIC GANGIA +2CF2..2CF3 ; ALetter # L& [2] COPTIC CAPITAL LETTER BOHAIRIC KHEI..COPTIC SMALL LETTER BOHAIRIC KHEI +2D00..2D25 ; ALetter # L& [38] GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER HOE +2D27 ; ALetter # L& GEORGIAN SMALL LETTER YN +2D2D ; ALetter # L& GEORGIAN SMALL LETTER AEN +2D30..2D67 ; ALetter # Lo [56] TIFINAGH LETTER YA..TIFINAGH LETTER YO +2D6F ; ALetter # Lm TIFINAGH MODIFIER LETTER LABIALIZATION MARK +2D80..2D96 ; ALetter # Lo [23] ETHIOPIC SYLLABLE LOA..ETHIOPIC SYLLABLE GGWE +2DA0..2DA6 ; ALetter # Lo [7] ETHIOPIC SYLLABLE SSA..ETHIOPIC SYLLABLE SSO +2DA8..2DAE ; ALetter # Lo [7] ETHIOPIC SYLLABLE CCA..ETHIOPIC SYLLABLE CCO 
+2DB0..2DB6 ; ALetter # Lo [7] ETHIOPIC SYLLABLE ZZA..ETHIOPIC SYLLABLE ZZO +2DB8..2DBE ; ALetter # Lo [7] ETHIOPIC SYLLABLE CCHA..ETHIOPIC SYLLABLE CCHO +2DC0..2DC6 ; ALetter # Lo [7] ETHIOPIC SYLLABLE QYA..ETHIOPIC SYLLABLE QYO +2DC8..2DCE ; ALetter # Lo [7] ETHIOPIC SYLLABLE KYA..ETHIOPIC SYLLABLE KYO +2DD0..2DD6 ; ALetter # Lo [7] ETHIOPIC SYLLABLE XYA..ETHIOPIC SYLLABLE XYO +2DD8..2DDE ; ALetter # Lo [7] ETHIOPIC SYLLABLE GYA..ETHIOPIC SYLLABLE GYO +2E2F ; ALetter # Lm VERTICAL TILDE +3005 ; ALetter # Lm IDEOGRAPHIC ITERATION MARK +303B ; ALetter # Lm VERTICAL IDEOGRAPHIC ITERATION MARK +303C ; ALetter # Lo MASU MARK +3105..312F ; ALetter # Lo [43] BOPOMOFO LETTER B..BOPOMOFO LETTER NN +3131..318E ; ALetter # Lo [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE +31A0..31BF ; ALetter # Lo [32] BOPOMOFO LETTER BU..BOPOMOFO LETTER AH +A000..A014 ; ALetter # Lo [21] YI SYLLABLE IT..YI SYLLABLE E +A015 ; ALetter # Lm YI SYLLABLE WU +A016..A48C ; ALetter # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR +A4D0..A4F7 ; ALetter # Lo [40] LISU LETTER BA..LISU LETTER OE +A4F8..A4FD ; ALetter # Lm [6] LISU LETTER TONE MYA TI..LISU LETTER TONE MYA JEU +A500..A60B ; ALetter # Lo [268] VAI SYLLABLE EE..VAI SYLLABLE NG +A60C ; ALetter # Lm VAI SYLLABLE LENGTHENER +A610..A61F ; ALetter # Lo [16] VAI SYLLABLE NDOLE FA..VAI SYMBOL JONG +A62A..A62B ; ALetter # Lo [2] VAI SYLLABLE NDOLE MA..VAI SYLLABLE NDOLE DO +A640..A66D ; ALetter # L& [46] CYRILLIC CAPITAL LETTER ZEMLYA..CYRILLIC SMALL LETTER DOUBLE MONOCULAR O +A66E ; ALetter # Lo CYRILLIC LETTER MULTIOCULAR O +A67F ; ALetter # Lm CYRILLIC PAYEROK +A680..A69B ; ALetter # L& [28] CYRILLIC CAPITAL LETTER DWE..CYRILLIC SMALL LETTER CROSSED O +A69C..A69D ; ALetter # Lm [2] MODIFIER LETTER CYRILLIC HARD SIGN..MODIFIER LETTER CYRILLIC SOFT SIGN +A6A0..A6E5 ; ALetter # Lo [70] BAMUM LETTER A..BAMUM LETTER KI +A6E6..A6EF ; ALetter # Nl [10] BAMUM LETTER MO..BAMUM LETTER KOGHOM +A708..A716 ; ALetter # Sk [15] MODIFIER LETTER EXTRA-HIGH 
DOTTED TONE BAR..MODIFIER LETTER EXTRA-LOW LEFT-STEM TONE BAR +A717..A71F ; ALetter # Lm [9] MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETTER LOW INVERTED EXCLAMATION MARK +A720..A721 ; ALetter # Sk [2] MODIFIER LETTER STRESS AND HIGH TONE..MODIFIER LETTER STRESS AND LOW TONE +A722..A76F ; ALetter # L& [78] LATIN CAPITAL LETTER EGYPTOLOGICAL ALEF..LATIN SMALL LETTER CON +A770 ; ALetter # Lm MODIFIER LETTER US +A771..A787 ; ALetter # L& [23] LATIN SMALL LETTER DUM..LATIN SMALL LETTER INSULAR T +A788 ; ALetter # Lm MODIFIER LETTER LOW CIRCUMFLEX ACCENT +A789..A78A ; ALetter # Sk [2] MODIFIER LETTER COLON..MODIFIER LETTER SHORT EQUALS SIGN +A78B..A78E ; ALetter # L& [4] LATIN CAPITAL LETTER SALTILLO..LATIN SMALL LETTER L WITH RETROFLEX HOOK AND BELT +A78F ; ALetter # Lo LATIN LETTER SINOLOGICAL DOT +A790..A7DC ; ALetter # L& [77] LATIN CAPITAL LETTER N WITH DESCENDER..LATIN CAPITAL LETTER LAMBDA WITH STROKE +A7F1..A7F4 ; ALetter # Lm [4] MODIFIER LETTER CAPITAL S..MODIFIER LETTER CAPITAL Q +A7F5..A7F6 ; ALetter # L& [2] LATIN CAPITAL LETTER REVERSED HALF H..LATIN SMALL LETTER REVERSED HALF H +A7F7 ; ALetter # Lo LATIN EPIGRAPHIC LETTER SIDEWAYS I +A7F8..A7F9 ; ALetter # Lm [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE +A7FA ; ALetter # L& LATIN LETTER SMALL CAPITAL TURNED M +A7FB..A801 ; ALetter # Lo [7] LATIN EPIGRAPHIC LETTER REVERSED F..SYLOTI NAGRI LETTER I +A803..A805 ; ALetter # Lo [3] SYLOTI NAGRI LETTER U..SYLOTI NAGRI LETTER O +A807..A80A ; ALetter # Lo [4] SYLOTI NAGRI LETTER KO..SYLOTI NAGRI LETTER GHO +A80C..A822 ; ALetter # Lo [23] SYLOTI NAGRI LETTER CO..SYLOTI NAGRI LETTER HO +A840..A873 ; ALetter # Lo [52] PHAGS-PA LETTER KA..PHAGS-PA LETTER CANDRABINDU +A882..A8B3 ; ALetter # Lo [50] SAURASHTRA LETTER A..SAURASHTRA LETTER LLA +A8F2..A8F7 ; ALetter # Lo [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVANAGARI SIGN CANDRABINDU AVAGRAHA +A8FB ; ALetter # Lo DEVANAGARI HEADSTROKE +A8FD..A8FE ; ALetter # Lo [2] DEVANAGARI JAIN 
OM..DEVANAGARI LETTER AY +A90A..A925 ; ALetter # Lo [28] KAYAH LI LETTER KA..KAYAH LI LETTER OO +A930..A946 ; ALetter # Lo [23] REJANG LETTER KA..REJANG LETTER A +A960..A97C ; ALetter # Lo [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH +A984..A9B2 ; ALetter # Lo [47] JAVANESE LETTER A..JAVANESE LETTER HA +A9CF ; ALetter # Lm JAVANESE PANGRANGKEP +AA00..AA28 ; ALetter # Lo [41] CHAM LETTER A..CHAM LETTER HA +AA40..AA42 ; ALetter # Lo [3] CHAM LETTER FINAL K..CHAM LETTER FINAL NG +AA44..AA4B ; ALetter # Lo [8] CHAM LETTER FINAL CH..CHAM LETTER FINAL SS +AAE0..AAEA ; ALetter # Lo [11] MEETEI MAYEK LETTER E..MEETEI MAYEK LETTER SSA +AAF2 ; ALetter # Lo MEETEI MAYEK ANJI +AAF3..AAF4 ; ALetter # Lm [2] MEETEI MAYEK SYLLABLE REPETITION MARK..MEETEI MAYEK WORD REPETITION MARK +AB01..AB06 ; ALetter # Lo [6] ETHIOPIC SYLLABLE TTHU..ETHIOPIC SYLLABLE TTHO +AB09..AB0E ; ALetter # Lo [6] ETHIOPIC SYLLABLE DDHU..ETHIOPIC SYLLABLE DDHO +AB11..AB16 ; ALetter # Lo [6] ETHIOPIC SYLLABLE DZU..ETHIOPIC SYLLABLE DZO +AB20..AB26 ; ALetter # Lo [7] ETHIOPIC SYLLABLE CCHHA..ETHIOPIC SYLLABLE CCHHO +AB28..AB2E ; ALetter # Lo [7] ETHIOPIC SYLLABLE BBA..ETHIOPIC SYLLABLE BBO +AB30..AB5A ; ALetter # L& [43] LATIN SMALL LETTER BARRED ALPHA..LATIN SMALL LETTER Y WITH SHORT RIGHT LEG +AB5B ; ALetter # Sk MODIFIER BREVE WITH INVERTED BREVE +AB5C..AB5F ; ALetter # Lm [4] MODIFIER LETTER SMALL HENG..MODIFIER LETTER SMALL U WITH LEFT HOOK +AB60..AB68 ; ALetter # L& [9] LATIN SMALL LETTER SAKHA YAT..LATIN SMALL LETTER TURNED R WITH MIDDLE TILDE +AB69 ; ALetter # Lm MODIFIER LETTER SMALL TURNED W +AB70..ABBF ; ALetter # L& [80] CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETTER YA +ABC0..ABE2 ; ALetter # Lo [35] MEETEI MAYEK LETTER KOK..MEETEI MAYEK LETTER I LONSUM +AC00..D7A3 ; ALetter # Lo [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH +D7B0..D7C6 ; ALetter # Lo [23] HANGUL JUNGSEONG O-YEO..HANGUL JUNGSEONG ARAEA-E +D7CB..D7FB ; ALetter # Lo [49] HANGUL JONGSEONG 
NIEUN-RIEUL..HANGUL JONGSEONG PHIEUPH-THIEUTH +FB00..FB06 ; ALetter # L& [7] LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE ST +FB13..FB17 ; ALetter # L& [5] ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SMALL LIGATURE MEN XEH +FB50..FBB1 ; ALetter # Lo [98] ARABIC LETTER ALEF WASLA ISOLATED FORM..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE FINAL FORM +FBD3..FD3D ; ALetter # Lo [363] ARABIC LETTER NG ISOLATED FORM..ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FORM +FD50..FD8F ; ALetter # Lo [64] ARABIC LIGATURE TEH WITH JEEM WITH MEEM INITIAL FORM..ARABIC LIGATURE MEEM WITH KHAH WITH MEEM INITIAL FORM +FD92..FDC7 ; ALetter # Lo [54] ARABIC LIGATURE MEEM WITH JEEM WITH KHAH INITIAL FORM..ARABIC LIGATURE NOON WITH JEEM WITH YEH FINAL FORM +FDF0..FDFB ; ALetter # Lo [12] ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN ISOLATED FORM..ARABIC LIGATURE JALLAJALALOUHOU +FE70..FE74 ; ALetter # Lo [5] ARABIC FATHATAN ISOLATED FORM..ARABIC KASRATAN ISOLATED FORM +FE76..FEFC ; ALetter # Lo [135] ARABIC FATHA ISOLATED FORM..ARABIC LIGATURE LAM WITH ALEF FINAL FORM +FF21..FF3A ; ALetter # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z +FF41..FF5A ; ALetter # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL LETTER Z +FFA0..FFBE ; ALetter # Lo [31] HALFWIDTH HANGUL FILLER..HALFWIDTH HANGUL LETTER HIEUH +FFC2..FFC7 ; ALetter # Lo [6] HALFWIDTH HANGUL LETTER A..HALFWIDTH HANGUL LETTER E +FFCA..FFCF ; ALetter # Lo [6] HALFWIDTH HANGUL LETTER YEO..HALFWIDTH HANGUL LETTER OE +FFD2..FFD7 ; ALetter # Lo [6] HALFWIDTH HANGUL LETTER YO..HALFWIDTH HANGUL LETTER YU +FFDA..FFDC ; ALetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I +10000..1000B ; ALetter # Lo [12] LINEAR B SYLLABLE B008 A..LINEAR B SYLLABLE B046 JE +1000D..10026 ; ALetter # Lo [26] LINEAR B SYLLABLE B036 JO..LINEAR B SYLLABLE B032 QO +10028..1003A ; ALetter # Lo [19] LINEAR B SYLLABLE B060 RA..LINEAR B SYLLABLE B042 WO +1003C..1003D ; ALetter # Lo [2] LINEAR B SYLLABLE B017 
ZA..LINEAR B SYLLABLE B074 ZE +1003F..1004D ; ALetter # Lo [15] LINEAR B SYLLABLE B020 ZO..LINEAR B SYLLABLE B091 TWO +10050..1005D ; ALetter # Lo [14] LINEAR B SYMBOL B018..LINEAR B SYMBOL B089 +10080..100FA ; ALetter # Lo [123] LINEAR B IDEOGRAM B100 MAN..LINEAR B IDEOGRAM VESSEL B305 +10140..10174 ; ALetter # Nl [53] GREEK ACROPHONIC ATTIC ONE QUARTER..GREEK ACROPHONIC STRATIAN FIFTY MNAS +10280..1029C ; ALetter # Lo [29] LYCIAN LETTER A..LYCIAN LETTER X +102A0..102D0 ; ALetter # Lo [49] CARIAN LETTER A..CARIAN LETTER UUU3 +10300..1031F ; ALetter # Lo [32] OLD ITALIC LETTER A..OLD ITALIC LETTER ESS +1032D..10340 ; ALetter # Lo [20] OLD ITALIC LETTER YE..GOTHIC LETTER PAIRTHRA +10341 ; ALetter # Nl GOTHIC LETTER NINETY +10342..10349 ; ALetter # Lo [8] GOTHIC LETTER RAIDA..GOTHIC LETTER OTHAL +1034A ; ALetter # Nl GOTHIC LETTER NINE HUNDRED +10350..10375 ; ALetter # Lo [38] OLD PERMIC LETTER AN..OLD PERMIC LETTER IA +10380..1039D ; ALetter # Lo [30] UGARITIC LETTER ALPA..UGARITIC LETTER SSU +103A0..103C3 ; ALetter # Lo [36] OLD PERSIAN SIGN A..OLD PERSIAN SIGN HA +103C8..103CF ; ALetter # Lo [8] OLD PERSIAN SIGN AURAMAZDAA..OLD PERSIAN SIGN BUUMISH +103D1..103D5 ; ALetter # Nl [5] OLD PERSIAN NUMBER ONE..OLD PERSIAN NUMBER HUNDRED +10400..1044F ; ALetter # L& [80] DESERET CAPITAL LETTER LONG I..DESERET SMALL LETTER EW +10450..1049D ; ALetter # Lo [78] SHAVIAN LETTER PEEP..OSMANYA LETTER OO +104B0..104D3 ; ALetter # L& [36] OSAGE CAPITAL LETTER A..OSAGE CAPITAL LETTER ZHA +104D8..104FB ; ALetter # L& [36] OSAGE SMALL LETTER A..OSAGE SMALL LETTER ZHA +10500..10527 ; ALetter # Lo [40] ELBASAN LETTER A..ELBASAN LETTER KHE +10530..10563 ; ALetter # Lo [52] CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBANIAN LETTER KIW +10570..1057A ; ALetter # L& [11] VITHKUQI CAPITAL LETTER A..VITHKUQI CAPITAL LETTER GA +1057C..1058A ; ALetter # L& [15] VITHKUQI CAPITAL LETTER HA..VITHKUQI CAPITAL LETTER RE +1058C..10592 ; ALetter # L& [7] VITHKUQI CAPITAL LETTER SE..VITHKUQI CAPITAL 
LETTER XE +10594..10595 ; ALetter # L& [2] VITHKUQI CAPITAL LETTER Y..VITHKUQI CAPITAL LETTER ZE +10597..105A1 ; ALetter # L& [11] VITHKUQI SMALL LETTER A..VITHKUQI SMALL LETTER GA +105A3..105B1 ; ALetter # L& [15] VITHKUQI SMALL LETTER HA..VITHKUQI SMALL LETTER RE +105B3..105B9 ; ALetter # L& [7] VITHKUQI SMALL LETTER SE..VITHKUQI SMALL LETTER XE +105BB..105BC ; ALetter # L& [2] VITHKUQI SMALL LETTER Y..VITHKUQI SMALL LETTER ZE +105C0..105F3 ; ALetter # Lo [52] TODHRI LETTER A..TODHRI LETTER OO +10600..10736 ; ALetter # Lo [311] LINEAR A SIGN AB001..LINEAR A SIGN A664 +10740..10755 ; ALetter # Lo [22] LINEAR A SIGN A701 A..LINEAR A SIGN A732 JE +10760..10767 ; ALetter # Lo [8] LINEAR A SIGN A800..LINEAR A SIGN A807 +10780..10785 ; ALetter # Lm [6] MODIFIER LETTER SMALL CAPITAL AA..MODIFIER LETTER SMALL B WITH HOOK +10787..107B0 ; ALetter # Lm [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK +107B2..107BA ; ALetter # Lm [9] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL S WITH CURL +10800..10805 ; ALetter # Lo [6] CYPRIOT SYLLABLE A..CYPRIOT SYLLABLE JA +10808 ; ALetter # Lo CYPRIOT SYLLABLE JO +1080A..10835 ; ALetter # Lo [44] CYPRIOT SYLLABLE KA..CYPRIOT SYLLABLE WO +10837..10838 ; ALetter # Lo [2] CYPRIOT SYLLABLE XA..CYPRIOT SYLLABLE XE +1083C ; ALetter # Lo CYPRIOT SYLLABLE ZA +1083F..10855 ; ALetter # Lo [23] CYPRIOT SYLLABLE ZO..IMPERIAL ARAMAIC LETTER TAW +10860..10876 ; ALetter # Lo [23] PALMYRENE LETTER ALEPH..PALMYRENE LETTER TAW +10880..1089E ; ALetter # Lo [31] NABATAEAN LETTER FINAL ALEPH..NABATAEAN LETTER TAW +108E0..108F2 ; ALetter # Lo [19] HATRAN LETTER ALEPH..HATRAN LETTER QOPH +108F4..108F5 ; ALetter # Lo [2] HATRAN LETTER SHIN..HATRAN LETTER TAW +10900..10915 ; ALetter # Lo [22] PHOENICIAN LETTER ALF..PHOENICIAN LETTER TAU +10920..10939 ; ALetter # Lo [26] LYDIAN LETTER A..LYDIAN LETTER C +10940..10959 ; ALetter # Lo [26] SIDETIC LETTER N01..SIDETIC LETTER N26 +10980..109B7 ; ALetter # Lo [56] MEROITIC 
HIEROGLYPHIC LETTER A..MEROITIC CURSIVE LETTER DA +109BE..109BF ; ALetter # Lo [2] MEROITIC CURSIVE LOGOGRAM RMT..MEROITIC CURSIVE LOGOGRAM IMN +10A00 ; ALetter # Lo KHAROSHTHI LETTER A +10A10..10A13 ; ALetter # Lo [4] KHAROSHTHI LETTER KA..KHAROSHTHI LETTER GHA +10A15..10A17 ; ALetter # Lo [3] KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA +10A19..10A35 ; ALetter # Lo [29] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER VHA +10A60..10A7C ; ALetter # Lo [29] OLD SOUTH ARABIAN LETTER HE..OLD SOUTH ARABIAN LETTER THETH +10A80..10A9C ; ALetter # Lo [29] OLD NORTH ARABIAN LETTER HEH..OLD NORTH ARABIAN LETTER ZAH +10AC0..10AC7 ; ALetter # Lo [8] MANICHAEAN LETTER ALEPH..MANICHAEAN LETTER WAW +10AC9..10AE4 ; ALetter # Lo [28] MANICHAEAN LETTER ZAYIN..MANICHAEAN LETTER TAW +10B00..10B35 ; ALetter # Lo [54] AVESTAN LETTER A..AVESTAN LETTER HE +10B40..10B55 ; ALetter # Lo [22] INSCRIPTIONAL PARTHIAN LETTER ALEPH..INSCRIPTIONAL PARTHIAN LETTER TAW +10B60..10B72 ; ALetter # Lo [19] INSCRIPTIONAL PAHLAVI LETTER ALEPH..INSCRIPTIONAL PAHLAVI LETTER TAW +10B80..10B91 ; ALetter # Lo [18] PSALTER PAHLAVI LETTER ALEPH..PSALTER PAHLAVI LETTER TAW +10C00..10C48 ; ALetter # Lo [73] OLD TURKIC LETTER ORKHON A..OLD TURKIC LETTER ORKHON BASH +10C80..10CB2 ; ALetter # L& [51] OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN CAPITAL LETTER US +10CC0..10CF2 ; ALetter # L& [51] OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN SMALL LETTER US +10D00..10D23 ; ALetter # Lo [36] HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA MARK NA KHONNA +10D4A..10D4D ; ALetter # Lo [4] GARAY VOWEL SIGN A..GARAY VOWEL SIGN EE +10D4E ; ALetter # Lm GARAY VOWEL LENGTH MARK +10D4F ; ALetter # Lo GARAY SUKUN +10D50..10D65 ; ALetter # L& [22] GARAY CAPITAL LETTER A..GARAY CAPITAL LETTER OLD NA +10D6F ; ALetter # Lm GARAY REDUPLICATION MARK +10D70..10D85 ; ALetter # L& [22] GARAY SMALL LETTER A..GARAY SMALL LETTER OLD NA +10E80..10EA9 ; ALetter # Lo [42] YEZIDI LETTER ELIF..YEZIDI LETTER ET +10EB0..10EB1 ; ALetter # Lo [2] YEZIDI LETTER LAM 
WITH DOT ABOVE..YEZIDI LETTER YOT WITH CIRCUMFLEX ABOVE +10EC2..10EC4 ; ALetter # Lo [3] ARABIC LETTER DAL WITH TWO DOTS VERTICALLY BELOW..ARABIC LETTER KAF WITH TWO DOTS VERTICALLY BELOW +10EC5 ; ALetter # Lm ARABIC SMALL YEH BARREE WITH TWO DOTS BELOW +10EC6..10EC7 ; ALetter # Lo [2] ARABIC LETTER THIN NOON..ARABIC LETTER YEH WITH FOUR DOTS BELOW +10F00..10F1C ; ALetter # Lo [29] OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER FINAL TAW WITH VERTICAL TAIL +10F27 ; ALetter # Lo OLD SOGDIAN LIGATURE AYIN-DALETH +10F30..10F45 ; ALetter # Lo [22] SOGDIAN LETTER ALEPH..SOGDIAN INDEPENDENT SHIN +10F70..10F81 ; ALetter # Lo [18] OLD UYGHUR LETTER ALEPH..OLD UYGHUR LETTER LESH +10FB0..10FC4 ; ALetter # Lo [21] CHORASMIAN LETTER ALEPH..CHORASMIAN LETTER TAW +10FE0..10FF6 ; ALetter # Lo [23] ELYMAIC LETTER ALEPH..ELYMAIC LIGATURE ZAYIN-YODH +11003..11037 ; ALetter # Lo [53] BRAHMI SIGN JIHVAMULIYA..BRAHMI LETTER OLD TAMIL NNNA +11071..11072 ; ALetter # Lo [2] BRAHMI LETTER OLD TAMIL SHORT E..BRAHMI LETTER OLD TAMIL SHORT O +11075 ; ALetter # Lo BRAHMI LETTER OLD TAMIL LLA +11083..110AF ; ALetter # Lo [45] KAITHI LETTER A..KAITHI LETTER HA +110D0..110E8 ; ALetter # Lo [25] SORA SOMPENG LETTER SAH..SORA SOMPENG LETTER MAE +11103..11126 ; ALetter # Lo [36] CHAKMA LETTER AA..CHAKMA LETTER HAA +11144 ; ALetter # Lo CHAKMA LETTER LHAA +11147 ; ALetter # Lo CHAKMA LETTER VAA +11150..11172 ; ALetter # Lo [35] MAHAJANI LETTER A..MAHAJANI LETTER RRA +11176 ; ALetter # Lo MAHAJANI LIGATURE SHRI +11183..111B2 ; ALetter # Lo [48] SHARADA LETTER A..SHARADA LETTER HA +111C1..111C4 ; ALetter # Lo [4] SHARADA SIGN AVAGRAHA..SHARADA OM +111DA ; ALetter # Lo SHARADA EKAM +111DC ; ALetter # Lo SHARADA HEADSTROKE +11200..11211 ; ALetter # Lo [18] KHOJKI LETTER A..KHOJKI LETTER JJA +11213..1122B ; ALetter # Lo [25] KHOJKI LETTER NYA..KHOJKI LETTER LLA +1123F..11240 ; ALetter # Lo [2] KHOJKI LETTER QA..KHOJKI LETTER SHORT I +11280..11286 ; ALetter # Lo [7] MULTANI LETTER A..MULTANI LETTER GA +11288 
; ALetter # Lo MULTANI LETTER GHA +1128A..1128D ; ALetter # Lo [4] MULTANI LETTER CA..MULTANI LETTER JJA +1128F..1129D ; ALetter # Lo [15] MULTANI LETTER NYA..MULTANI LETTER BA +1129F..112A8 ; ALetter # Lo [10] MULTANI LETTER BHA..MULTANI LETTER RHA +112B0..112DE ; ALetter # Lo [47] KHUDAWADI LETTER A..KHUDAWADI LETTER HA +11305..1130C ; ALetter # Lo [8] GRANTHA LETTER A..GRANTHA LETTER VOCALIC L +1130F..11310 ; ALetter # Lo [2] GRANTHA LETTER EE..GRANTHA LETTER AI +11313..11328 ; ALetter # Lo [22] GRANTHA LETTER OO..GRANTHA LETTER NA +1132A..11330 ; ALetter # Lo [7] GRANTHA LETTER PA..GRANTHA LETTER RA +11332..11333 ; ALetter # Lo [2] GRANTHA LETTER LA..GRANTHA LETTER LLA +11335..11339 ; ALetter # Lo [5] GRANTHA LETTER VA..GRANTHA LETTER HA +1133D ; ALetter # Lo GRANTHA SIGN AVAGRAHA +11350 ; ALetter # Lo GRANTHA OM +1135D..11361 ; ALetter # Lo [5] GRANTHA SIGN PLUTA..GRANTHA LETTER VOCALIC LL +11380..11389 ; ALetter # Lo [10] TULU-TIGALARI LETTER A..TULU-TIGALARI LETTER VOCALIC LL +1138B ; ALetter # Lo TULU-TIGALARI LETTER EE +1138E ; ALetter # Lo TULU-TIGALARI LETTER AI +11390..113B5 ; ALetter # Lo [38] TULU-TIGALARI LETTER OO..TULU-TIGALARI LETTER LLLA +113B7 ; ALetter # Lo TULU-TIGALARI SIGN AVAGRAHA +113D1 ; ALetter # Lo TULU-TIGALARI REPHA +113D3 ; ALetter # Lo TULU-TIGALARI SIGN PLUTA +11400..11434 ; ALetter # Lo [53] NEWA LETTER A..NEWA LETTER HA +11447..1144A ; ALetter # Lo [4] NEWA SIGN AVAGRAHA..NEWA SIDDHI +1145F..11461 ; ALetter # Lo [3] NEWA LETTER VEDIC ANUSVARA..NEWA SIGN UPADHMANIYA +11480..114AF ; ALetter # Lo [48] TIRHUTA ANJI..TIRHUTA LETTER HA +114C4..114C5 ; ALetter # Lo [2] TIRHUTA SIGN AVAGRAHA..TIRHUTA GVANG +114C7 ; ALetter # Lo TIRHUTA OM +11580..115AE ; ALetter # Lo [47] SIDDHAM LETTER A..SIDDHAM LETTER HA +115D8..115DB ; ALetter # Lo [4] SIDDHAM LETTER THREE-CIRCLE ALTERNATE I..SIDDHAM LETTER ALTERNATE U +11600..1162F ; ALetter # Lo [48] MODI LETTER A..MODI LETTER LLA +11644 ; ALetter # Lo MODI SIGN HUVA +11680..116AA ; ALetter # Lo 
[43] TAKRI LETTER A..TAKRI LETTER RRA +116B8 ; ALetter # Lo TAKRI LETTER ARCHAIC KHA +11800..1182B ; ALetter # Lo [44] DOGRA LETTER A..DOGRA LETTER RRA +118A0..118DF ; ALetter # L& [64] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI SMALL LETTER VIYO +118FF..11906 ; ALetter # Lo [8] WARANG CITI OM..DIVES AKURU LETTER E +11909 ; ALetter # Lo DIVES AKURU LETTER O +1190C..11913 ; ALetter # Lo [8] DIVES AKURU LETTER KA..DIVES AKURU LETTER JA +11915..11916 ; ALetter # Lo [2] DIVES AKURU LETTER NYA..DIVES AKURU LETTER TTA +11918..1192F ; ALetter # Lo [24] DIVES AKURU LETTER DDA..DIVES AKURU LETTER ZA +1193F ; ALetter # Lo DIVES AKURU PREFIXED NASAL SIGN +11941 ; ALetter # Lo DIVES AKURU INITIAL RA +119A0..119A7 ; ALetter # Lo [8] NANDINAGARI LETTER A..NANDINAGARI LETTER VOCALIC RR +119AA..119D0 ; ALetter # Lo [39] NANDINAGARI LETTER E..NANDINAGARI LETTER RRA +119E1 ; ALetter # Lo NANDINAGARI SIGN AVAGRAHA +119E3 ; ALetter # Lo NANDINAGARI HEADSTROKE +11A00 ; ALetter # Lo ZANABAZAR SQUARE LETTER A +11A0B..11A32 ; ALetter # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA +11A3A ; ALetter # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA +11A50 ; ALetter # Lo SOYOMBO LETTER A +11A5C..11A89 ; ALetter # Lo [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA +11A9D ; ALetter # Lo SOYOMBO MARK PLUTA +11AB0..11AF8 ; ALetter # Lo [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL +11BC0..11BE0 ; ALetter # Lo [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO +11C00..11C08 ; ALetter # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L +11C0A..11C2E ; ALetter # Lo [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA +11C40 ; ALetter # Lo BHAIKSUKI SIGN AVAGRAHA +11C72..11C8F ; ALetter # Lo [30] MARCHEN LETTER KA..MARCHEN LETTER A +11D00..11D06 ; ALetter # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E +11D08..11D09 ; ALetter # Lo [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O +11D0B..11D30 ; ALetter # Lo [38] MASARAM GONDI LETTER AU..MASARAM 
GONDI LETTER TRA +11D46 ; ALetter # Lo MASARAM GONDI REPHA +11D60..11D65 ; ALetter # Lo [6] GUNJALA GONDI LETTER A..GUNJALA GONDI LETTER UU +11D67..11D68 ; ALetter # Lo [2] GUNJALA GONDI LETTER EE..GUNJALA GONDI LETTER AI +11D6A..11D89 ; ALetter # Lo [32] GUNJALA GONDI LETTER OO..GUNJALA GONDI LETTER SA +11D98 ; ALetter # Lo GUNJALA GONDI OM +11DB0..11DD8 ; ALetter # Lo [41] TOLONG SIKI LETTER I..TOLONG SIKI LETTER RRH +11DD9 ; ALetter # Lm TOLONG SIKI SIGN SELA +11DDA..11DDB ; ALetter # Lo [2] TOLONG SIKI SIGN HECAKA..TOLONG SIKI UNGGA +11EE0..11EF2 ; ALetter # Lo [19] MAKASAR LETTER KA..MAKASAR ANGKA +11F02 ; ALetter # Lo KAWI SIGN REPHA +11F04..11F10 ; ALetter # Lo [13] KAWI LETTER A..KAWI LETTER O +11F12..11F33 ; ALetter # Lo [34] KAWI LETTER KA..KAWI LETTER JNYA +11FB0 ; ALetter # Lo LISU LETTER YHA +12000..12399 ; ALetter # Lo [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U +12400..1246E ; ALetter # Nl [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM +12480..12543 ; ALetter # Lo [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU +12F90..12FF0 ; ALetter # Lo [97] CYPRO-MINOAN SIGN CM001..CYPRO-MINOAN SIGN CM114 +13000..1342F ; ALetter # Lo [1072] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH V011D +13441..13446 ; ALetter # Lo [6] EGYPTIAN HIEROGLYPH FULL BLANK..EGYPTIAN HIEROGLYPH WIDE LOST SIGN +13460..143FA ; ALetter # Lo [3995] EGYPTIAN HIEROGLYPH-13460..EGYPTIAN HIEROGLYPH-143FA +14400..14646 ; ALetter # Lo [583] ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLYPH A530 +16100..1611D ; ALetter # Lo [30] GURUNG KHEMA LETTER A..GURUNG KHEMA LETTER SA +16800..16A38 ; ALetter # Lo [569] BAMUM LETTER PHASE-A NGKUE MFON..BAMUM LETTER PHASE-F VUEQ +16A40..16A5E ; ALetter # Lo [31] MRO LETTER TA..MRO LETTER TEK +16A70..16ABE ; ALetter # Lo [79] TANGSA LETTER OZ..TANGSA LETTER ZA +16AD0..16AED ; ALetter # Lo [30] BASSA VAH LETTER ENNI..BASSA VAH LETTER I +16B00..16B2F ; ALetter # Lo [48] PAHAWH HMONG VOWEL 
KEEB..PAHAWH HMONG CONSONANT CAU +16B40..16B43 ; ALetter # Lm [4] PAHAWH HMONG SIGN VOS SEEV..PAHAWH HMONG SIGN IB YAM +16B63..16B77 ; ALetter # Lo [21] PAHAWH HMONG SIGN VOS LUB..PAHAWH HMONG SIGN CIM NRES TOS +16B7D..16B8F ; ALetter # Lo [19] PAHAWH HMONG CLAN SIGN TSHEEJ..PAHAWH HMONG CLAN SIGN VWJ +16D40..16D42 ; ALetter # Lm [3] KIRAT RAI SIGN ANUSVARA..KIRAT RAI SIGN VISARGA +16D43..16D6A ; ALetter # Lo [40] KIRAT RAI LETTER A..KIRAT RAI VOWEL SIGN AU +16D6B..16D6C ; ALetter # Lm [2] KIRAT RAI SIGN VIRAMA..KIRAT RAI SIGN SAAT +16E40..16E7F ; ALetter # L& [64] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN SMALL LETTER Y +16EA0..16EB8 ; ALetter # L& [25] BERIA ERFE CAPITAL LETTER ARKAB..BERIA ERFE CAPITAL LETTER AY +16EBB..16ED3 ; ALetter # L& [25] BERIA ERFE SMALL LETTER ARKAB..BERIA ERFE SMALL LETTER AY +16F00..16F4A ; ALetter # Lo [75] MIAO LETTER PA..MIAO LETTER RTE +16F50 ; ALetter # Lo MIAO LETTER NASALIZATION +16F93..16F9F ; ALetter # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8 +16FE0..16FE1 ; ALetter # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK +16FE3 ; ALetter # Lm OLD CHINESE ITERATION MARK +1BC00..1BC6A ; ALetter # Lo [107] DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M +1BC70..1BC7C ; ALetter # Lo [13] DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOYAN AFFIX ATTACHED TANGENT HOOK +1BC80..1BC88 ; ALetter # Lo [9] DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIGH VERTICAL +1BC90..1BC99 ; ALetter # Lo [10] DUPLOYAN AFFIX LOW ACUTE..DUPLOYAN AFFIX LOW ARROW +1D400..1D454 ; ALetter # L& [85] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL ITALIC SMALL G +1D456..1D49C ; ALetter # L& [71] MATHEMATICAL ITALIC SMALL I..MATHEMATICAL SCRIPT CAPITAL A +1D49E..1D49F ; ALetter # L& [2] MATHEMATICAL SCRIPT CAPITAL C..MATHEMATICAL SCRIPT CAPITAL D +1D4A2 ; ALetter # L& MATHEMATICAL SCRIPT CAPITAL G +1D4A5..1D4A6 ; ALetter # L& [2] MATHEMATICAL SCRIPT CAPITAL J..MATHEMATICAL SCRIPT CAPITAL K +1D4A9..1D4AC ; ALetter # L& [4] MATHEMATICAL SCRIPT CAPITAL 
N..MATHEMATICAL SCRIPT CAPITAL Q +1D4AE..1D4B9 ; ALetter # L& [12] MATHEMATICAL SCRIPT CAPITAL S..MATHEMATICAL SCRIPT SMALL D +1D4BB ; ALetter # L& MATHEMATICAL SCRIPT SMALL F +1D4BD..1D4C3 ; ALetter # L& [7] MATHEMATICAL SCRIPT SMALL H..MATHEMATICAL SCRIPT SMALL N +1D4C5..1D505 ; ALetter # L& [65] MATHEMATICAL SCRIPT SMALL P..MATHEMATICAL FRAKTUR CAPITAL B +1D507..1D50A ; ALetter # L& [4] MATHEMATICAL FRAKTUR CAPITAL D..MATHEMATICAL FRAKTUR CAPITAL G +1D50D..1D514 ; ALetter # L& [8] MATHEMATICAL FRAKTUR CAPITAL J..MATHEMATICAL FRAKTUR CAPITAL Q +1D516..1D51C ; ALetter # L& [7] MATHEMATICAL FRAKTUR CAPITAL S..MATHEMATICAL FRAKTUR CAPITAL Y +1D51E..1D539 ; ALetter # L& [28] MATHEMATICAL FRAKTUR SMALL A..MATHEMATICAL DOUBLE-STRUCK CAPITAL B +1D53B..1D53E ; ALetter # L& [4] MATHEMATICAL DOUBLE-STRUCK CAPITAL D..MATHEMATICAL DOUBLE-STRUCK CAPITAL G +1D540..1D544 ; ALetter # L& [5] MATHEMATICAL DOUBLE-STRUCK CAPITAL I..MATHEMATICAL DOUBLE-STRUCK CAPITAL M +1D546 ; ALetter # L& MATHEMATICAL DOUBLE-STRUCK CAPITAL O +1D54A..1D550 ; ALetter # L& [7] MATHEMATICAL DOUBLE-STRUCK CAPITAL S..MATHEMATICAL DOUBLE-STRUCK CAPITAL Y +1D552..1D6A5 ; ALetter # L& [340] MATHEMATICAL DOUBLE-STRUCK SMALL A..MATHEMATICAL ITALIC SMALL DOTLESS J +1D6A8..1D6C0 ; ALetter # L& [25] MATHEMATICAL BOLD CAPITAL ALPHA..MATHEMATICAL BOLD CAPITAL OMEGA +1D6C2..1D6DA ; ALetter # L& [25] MATHEMATICAL BOLD SMALL ALPHA..MATHEMATICAL BOLD SMALL OMEGA +1D6DC..1D6FA ; ALetter # L& [31] MATHEMATICAL BOLD EPSILON SYMBOL..MATHEMATICAL ITALIC CAPITAL OMEGA +1D6FC..1D714 ; ALetter # L& [25] MATHEMATICAL ITALIC SMALL ALPHA..MATHEMATICAL ITALIC SMALL OMEGA +1D716..1D734 ; ALetter # L& [31] MATHEMATICAL ITALIC EPSILON SYMBOL..MATHEMATICAL BOLD ITALIC CAPITAL OMEGA +1D736..1D74E ; ALetter # L& [25] MATHEMATICAL BOLD ITALIC SMALL ALPHA..MATHEMATICAL BOLD ITALIC SMALL OMEGA +1D750..1D76E ; ALetter # L& [31] MATHEMATICAL BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD CAPITAL OMEGA +1D770..1D788 ; ALetter # L& 
[25] MATHEMATICAL SANS-SERIF BOLD SMALL ALPHA..MATHEMATICAL SANS-SERIF BOLD SMALL OMEGA +1D78A..1D7A8 ; ALetter # L& [31] MATHEMATICAL SANS-SERIF BOLD EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL OMEGA +1D7AA..1D7C2 ; ALetter # L& [25] MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ALPHA..MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL OMEGA +1D7C4..1D7CB ; ALetter # L& [8] MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL BOLD SMALL DIGAMMA +1DF00..1DF09 ; ALetter # L& [10] LATIN SMALL LETTER FENG DIGRAPH WITH TRILL..LATIN SMALL LETTER T WITH HOOK AND RETROFLEX HOOK +1DF0A ; ALetter # Lo LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK +1DF0B..1DF1E ; ALetter # L& [20] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN SMALL LETTER S WITH CURL +1DF25..1DF2A ; ALetter # L& [6] LATIN SMALL LETTER D WITH MID-HEIGHT LEFT HOOK..LATIN SMALL LETTER T WITH MID-HEIGHT LEFT HOOK +1E030..1E06D ; ALetter # Lm [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE +1E100..1E12C ; ALetter # Lo [45] NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PUACHUE HMONG LETTER W +1E137..1E13D ; ALetter # Lm [7] NYIAKENG PUACHUE HMONG SIGN FOR PERSON..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER +1E14E ; ALetter # Lo NYIAKENG PUACHUE HMONG LOGOGRAM NYAJ +1E290..1E2AD ; ALetter # Lo [30] TOTO LETTER PA..TOTO LETTER A +1E2C0..1E2EB ; ALetter # Lo [44] WANCHO LETTER AA..WANCHO LETTER YIH +1E4D0..1E4EA ; ALetter # Lo [27] NAG MUNDARI LETTER O..NAG MUNDARI LETTER ELL +1E4EB ; ALetter # Lm NAG MUNDARI SIGN OJOD +1E5D0..1E5ED ; ALetter # Lo [30] OL ONAL LETTER O..OL ONAL LETTER EG +1E5F0 ; ALetter # Lo OL ONAL SIGN HODDOND +1E6C0..1E6DE ; ALetter # Lo [31] TAI YO LETTER LOW KO..TAI YO LETTER HIGH KVO +1E6E0..1E6E2 ; ALetter # Lo [3] TAI YO LETTER AA..TAI YO LETTER UE +1E6E4..1E6E5 ; ALetter # Lo [2] TAI YO LETTER U..TAI YO LETTER AE +1E6E7..1E6ED ; ALetter # Lo [7] TAI YO LETTER O..TAI YO LETTER AUE +1E6F0..1E6F4 ; ALetter # Lo [5] TAI YO LETTER 
AN..TAI YO LETTER AP +1E6FE ; ALetter # Lo TAI YO SYMBOL MUEANG +1E6FF ; ALetter # Lm TAI YO XAM LAI +1E7E0..1E7E6 ; ALetter # Lo [7] ETHIOPIC SYLLABLE HHYA..ETHIOPIC SYLLABLE HHYO +1E7E8..1E7EB ; ALetter # Lo [4] ETHIOPIC SYLLABLE GURAGE HHWA..ETHIOPIC SYLLABLE HHWE +1E7ED..1E7EE ; ALetter # Lo [2] ETHIOPIC SYLLABLE GURAGE MWI..ETHIOPIC SYLLABLE GURAGE MWEE +1E7F0..1E7FE ; ALetter # Lo [15] ETHIOPIC SYLLABLE GURAGE QWI..ETHIOPIC SYLLABLE GURAGE PWEE +1E800..1E8C4 ; ALetter # Lo [197] MENDE KIKAKUI SYLLABLE M001 KI..MENDE KIKAKUI SYLLABLE M060 NYON +1E900..1E943 ; ALetter # L& [68] ADLAM CAPITAL LETTER ALIF..ADLAM SMALL LETTER SHA +1E94B ; ALetter # Lm ADLAM NASALIZATION MARK +1EE00..1EE03 ; ALetter # Lo [4] ARABIC MATHEMATICAL ALEF..ARABIC MATHEMATICAL DAL +1EE05..1EE1F ; ALetter # Lo [27] ARABIC MATHEMATICAL WAW..ARABIC MATHEMATICAL DOTLESS QAF +1EE21..1EE22 ; ALetter # Lo [2] ARABIC MATHEMATICAL INITIAL BEH..ARABIC MATHEMATICAL INITIAL JEEM +1EE24 ; ALetter # Lo ARABIC MATHEMATICAL INITIAL HEH +1EE27 ; ALetter # Lo ARABIC MATHEMATICAL INITIAL HAH +1EE29..1EE32 ; ALetter # Lo [10] ARABIC MATHEMATICAL INITIAL YEH..ARABIC MATHEMATICAL INITIAL QAF +1EE34..1EE37 ; ALetter # Lo [4] ARABIC MATHEMATICAL INITIAL SHEEN..ARABIC MATHEMATICAL INITIAL KHAH +1EE39 ; ALetter # Lo ARABIC MATHEMATICAL INITIAL DAD +1EE3B ; ALetter # Lo ARABIC MATHEMATICAL INITIAL GHAIN +1EE42 ; ALetter # Lo ARABIC MATHEMATICAL TAILED JEEM +1EE47 ; ALetter # Lo ARABIC MATHEMATICAL TAILED HAH +1EE49 ; ALetter # Lo ARABIC MATHEMATICAL TAILED YEH +1EE4B ; ALetter # Lo ARABIC MATHEMATICAL TAILED LAM +1EE4D..1EE4F ; ALetter # Lo [3] ARABIC MATHEMATICAL TAILED NOON..ARABIC MATHEMATICAL TAILED AIN +1EE51..1EE52 ; ALetter # Lo [2] ARABIC MATHEMATICAL TAILED SAD..ARABIC MATHEMATICAL TAILED QAF +1EE54 ; ALetter # Lo ARABIC MATHEMATICAL TAILED SHEEN +1EE57 ; ALetter # Lo ARABIC MATHEMATICAL TAILED KHAH +1EE59 ; ALetter # Lo ARABIC MATHEMATICAL TAILED DAD +1EE5B ; ALetter # Lo ARABIC MATHEMATICAL TAILED GHAIN 
+1EE5D ; ALetter # Lo ARABIC MATHEMATICAL TAILED DOTLESS NOON +1EE5F ; ALetter # Lo ARABIC MATHEMATICAL TAILED DOTLESS QAF +1EE61..1EE62 ; ALetter # Lo [2] ARABIC MATHEMATICAL STRETCHED BEH..ARABIC MATHEMATICAL STRETCHED JEEM +1EE64 ; ALetter # Lo ARABIC MATHEMATICAL STRETCHED HEH +1EE67..1EE6A ; ALetter # Lo [4] ARABIC MATHEMATICAL STRETCHED HAH..ARABIC MATHEMATICAL STRETCHED KAF +1EE6C..1EE72 ; ALetter # Lo [7] ARABIC MATHEMATICAL STRETCHED MEEM..ARABIC MATHEMATICAL STRETCHED QAF +1EE74..1EE77 ; ALetter # Lo [4] ARABIC MATHEMATICAL STRETCHED SHEEN..ARABIC MATHEMATICAL STRETCHED KHAH +1EE79..1EE7C ; ALetter # Lo [4] ARABIC MATHEMATICAL STRETCHED DAD..ARABIC MATHEMATICAL STRETCHED DOTLESS BEH +1EE7E ; ALetter # Lo ARABIC MATHEMATICAL STRETCHED DOTLESS FEH +1EE80..1EE89 ; ALetter # Lo [10] ARABIC MATHEMATICAL LOOPED ALEF..ARABIC MATHEMATICAL LOOPED YEH +1EE8B..1EE9B ; ALetter # Lo [17] ARABIC MATHEMATICAL LOOPED LAM..ARABIC MATHEMATICAL LOOPED GHAIN +1EEA1..1EEA3 ; ALetter # Lo [3] ARABIC MATHEMATICAL DOUBLE-STRUCK BEH..ARABIC MATHEMATICAL DOUBLE-STRUCK DAL +1EEA5..1EEA9 ; ALetter # Lo [5] ARABIC MATHEMATICAL DOUBLE-STRUCK WAW..ARABIC MATHEMATICAL DOUBLE-STRUCK YEH +1EEAB..1EEBB ; ALetter # Lo [17] ARABIC MATHEMATICAL DOUBLE-STRUCK LAM..ARABIC MATHEMATICAL DOUBLE-STRUCK GHAIN +1F130..1F149 ; ALetter # So [26] SQUARED LATIN CAPITAL LETTER A..SQUARED LATIN CAPITAL LETTER Z +1F150..1F169 ; ALetter # So [26] NEGATIVE CIRCLED LATIN CAPITAL LETTER A..NEGATIVE CIRCLED LATIN CAPITAL LETTER Z +1F170..1F189 ; ALetter # So [26] NEGATIVE SQUARED LATIN CAPITAL LETTER A..NEGATIVE SQUARED LATIN CAPITAL LETTER Z + +# Total code points: 33973 + +# ================================================ + +003A ; MidLetter # Po COLON +00B7 ; MidLetter # Po MIDDLE DOT +0387 ; MidLetter # Po GREEK ANO TELEIA +055F ; MidLetter # Po ARMENIAN ABBREVIATION MARK +05F4 ; MidLetter # Po HEBREW PUNCTUATION GERSHAYIM +2027 ; MidLetter # Po HYPHENATION POINT +FE13 ; MidLetter # Po PRESENTATION FORM FOR 
VERTICAL COLON +FE55 ; MidLetter # Po SMALL COLON +FF1A ; MidLetter # Po FULLWIDTH COLON + +# Total code points: 9 + +# ================================================ + +002C ; MidNum # Po COMMA +003B ; MidNum # Po SEMICOLON +037E ; MidNum # Po GREEK QUESTION MARK +0589 ; MidNum # Po ARMENIAN FULL STOP +060C..060D ; MidNum # Po [2] ARABIC COMMA..ARABIC DATE SEPARATOR +066C ; MidNum # Po ARABIC THOUSANDS SEPARATOR +07F8 ; MidNum # Po NKO COMMA +2044 ; MidNum # Sm FRACTION SLASH +FE50 ; MidNum # Po SMALL COMMA +FE54 ; MidNum # Po SMALL SEMICOLON +FF0C ; MidNum # Po FULLWIDTH COMMA +FF1B ; MidNum # Po FULLWIDTH SEMICOLON + +# Total code points: 13 + +# ================================================ + +002E ; MidNumLet # Po FULL STOP +2018 ; MidNumLet # Pi LEFT SINGLE QUOTATION MARK +2019 ; MidNumLet # Pf RIGHT SINGLE QUOTATION MARK +2024 ; MidNumLet # Po ONE DOT LEADER +FE52 ; MidNumLet # Po SMALL FULL STOP +FF07 ; MidNumLet # Po FULLWIDTH APOSTROPHE +FF0E ; MidNumLet # Po FULLWIDTH FULL STOP + +# Total code points: 7 + +# ================================================ + +0030..0039 ; Numeric # Nd [10] DIGIT ZERO..DIGIT NINE +0600..0605 ; Numeric # Cf [6] ARABIC NUMBER SIGN..ARABIC NUMBER MARK ABOVE +0660..0669 ; Numeric # Nd [10] ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT NINE +066B ; Numeric # Po ARABIC DECIMAL SEPARATOR +06DD ; Numeric # Cf ARABIC END OF AYAH +06F0..06F9 ; Numeric # Nd [10] EXTENDED ARABIC-INDIC DIGIT ZERO..EXTENDED ARABIC-INDIC DIGIT NINE +07C0..07C9 ; Numeric # Nd [10] NKO DIGIT ZERO..NKO DIGIT NINE +0890..0891 ; Numeric # Cf [2] ARABIC POUND MARK ABOVE..ARABIC PIASTRE MARK ABOVE +08E2 ; Numeric # Cf ARABIC DISPUTED END OF AYAH +0966..096F ; Numeric # Nd [10] DEVANAGARI DIGIT ZERO..DEVANAGARI DIGIT NINE +09E6..09EF ; Numeric # Nd [10] BENGALI DIGIT ZERO..BENGALI DIGIT NINE +0A66..0A6F ; Numeric # Nd [10] GURMUKHI DIGIT ZERO..GURMUKHI DIGIT NINE +0AE6..0AEF ; Numeric # Nd [10] GUJARATI DIGIT ZERO..GUJARATI DIGIT NINE +0B66..0B6F ; Numeric # 
Nd [10] ORIYA DIGIT ZERO..ORIYA DIGIT NINE +0BE6..0BEF ; Numeric # Nd [10] TAMIL DIGIT ZERO..TAMIL DIGIT NINE +0C66..0C6F ; Numeric # Nd [10] TELUGU DIGIT ZERO..TELUGU DIGIT NINE +0CE6..0CEF ; Numeric # Nd [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE +0D66..0D6F ; Numeric # Nd [10] MALAYALAM DIGIT ZERO..MALAYALAM DIGIT NINE +0DE6..0DEF ; Numeric # Nd [10] SINHALA LITH DIGIT ZERO..SINHALA LITH DIGIT NINE +0E50..0E59 ; Numeric # Nd [10] THAI DIGIT ZERO..THAI DIGIT NINE +0ED0..0ED9 ; Numeric # Nd [10] LAO DIGIT ZERO..LAO DIGIT NINE +0F20..0F29 ; Numeric # Nd [10] TIBETAN DIGIT ZERO..TIBETAN DIGIT NINE +1040..1049 ; Numeric # Nd [10] MYANMAR DIGIT ZERO..MYANMAR DIGIT NINE +1090..1099 ; Numeric # Nd [10] MYANMAR SHAN DIGIT ZERO..MYANMAR SHAN DIGIT NINE +17E0..17E9 ; Numeric # Nd [10] KHMER DIGIT ZERO..KHMER DIGIT NINE +1810..1819 ; Numeric # Nd [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE +1946..194F ; Numeric # Nd [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE +19D0..19D9 ; Numeric # Nd [10] NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE +19DA ; Numeric # No NEW TAI LUE THAM DIGIT ONE +1A80..1A89 ; Numeric # Nd [10] TAI THAM HORA DIGIT ZERO..TAI THAM HORA DIGIT NINE +1A90..1A99 ; Numeric # Nd [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE +1B50..1B59 ; Numeric # Nd [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE +1BB0..1BB9 ; Numeric # Nd [10] SUNDANESE DIGIT ZERO..SUNDANESE DIGIT NINE +1C40..1C49 ; Numeric # Nd [10] LEPCHA DIGIT ZERO..LEPCHA DIGIT NINE +1C50..1C59 ; Numeric # Nd [10] OL CHIKI DIGIT ZERO..OL CHIKI DIGIT NINE +A620..A629 ; Numeric # Nd [10] VAI DIGIT ZERO..VAI DIGIT NINE +A8D0..A8D9 ; Numeric # Nd [10] SAURASHTRA DIGIT ZERO..SAURASHTRA DIGIT NINE +A900..A909 ; Numeric # Nd [10] KAYAH LI DIGIT ZERO..KAYAH LI DIGIT NINE +A9D0..A9D9 ; Numeric # Nd [10] JAVANESE DIGIT ZERO..JAVANESE DIGIT NINE +A9F0..A9F9 ; Numeric # Nd [10] MYANMAR TAI LAING DIGIT ZERO..MYANMAR TAI LAING DIGIT NINE +AA50..AA59 ; Numeric # Nd [10] CHAM DIGIT ZERO..CHAM DIGIT NINE 
+ABF0..ABF9 ; Numeric # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT NINE +FF10..FF19 ; Numeric # Nd [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE +104A0..104A9 ; Numeric # Nd [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE +10D30..10D39 ; Numeric # Nd [10] HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA DIGIT NINE +10D40..10D49 ; Numeric # Nd [10] GARAY DIGIT ZERO..GARAY DIGIT NINE +11066..1106F ; Numeric # Nd [10] BRAHMI DIGIT ZERO..BRAHMI DIGIT NINE +110BD ; Numeric # Cf KAITHI NUMBER SIGN +110CD ; Numeric # Cf KAITHI NUMBER SIGN ABOVE +110F0..110F9 ; Numeric # Nd [10] SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT NINE +11136..1113F ; Numeric # Nd [10] CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE +111D0..111D9 ; Numeric # Nd [10] SHARADA DIGIT ZERO..SHARADA DIGIT NINE +112F0..112F9 ; Numeric # Nd [10] KHUDAWADI DIGIT ZERO..KHUDAWADI DIGIT NINE +11450..11459 ; Numeric # Nd [10] NEWA DIGIT ZERO..NEWA DIGIT NINE +114D0..114D9 ; Numeric # Nd [10] TIRHUTA DIGIT ZERO..TIRHUTA DIGIT NINE +11650..11659 ; Numeric # Nd [10] MODI DIGIT ZERO..MODI DIGIT NINE +116C0..116C9 ; Numeric # Nd [10] TAKRI DIGIT ZERO..TAKRI DIGIT NINE +116D0..116E3 ; Numeric # Nd [20] MYANMAR PAO DIGIT ZERO..MYANMAR EASTERN PWO KAREN DIGIT NINE +11730..11739 ; Numeric # Nd [10] AHOM DIGIT ZERO..AHOM DIGIT NINE +118E0..118E9 ; Numeric # Nd [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE +11950..11959 ; Numeric # Nd [10] DIVES AKURU DIGIT ZERO..DIVES AKURU DIGIT NINE +11BF0..11BF9 ; Numeric # Nd [10] SUNUWAR DIGIT ZERO..SUNUWAR DIGIT NINE +11C50..11C59 ; Numeric # Nd [10] BHAIKSUKI DIGIT ZERO..BHAIKSUKI DIGIT NINE +11D50..11D59 ; Numeric # Nd [10] MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT NINE +11DA0..11DA9 ; Numeric # Nd [10] GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT NINE +11DE0..11DE9 ; Numeric # Nd [10] TOLONG SIKI DIGIT ZERO..TOLONG SIKI DIGIT NINE +11F50..11F59 ; Numeric # Nd [10] KAWI DIGIT ZERO..KAWI DIGIT NINE +16130..16139 ; Numeric # Nd [10] GURUNG KHEMA DIGIT ZERO..GURUNG KHEMA DIGIT 
NINE +16A60..16A69 ; Numeric # Nd [10] MRO DIGIT ZERO..MRO DIGIT NINE +16AC0..16AC9 ; Numeric # Nd [10] TANGSA DIGIT ZERO..TANGSA DIGIT NINE +16B50..16B59 ; Numeric # Nd [10] PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT NINE +16D70..16D79 ; Numeric # Nd [10] KIRAT RAI DIGIT ZERO..KIRAT RAI DIGIT NINE +1CCF0..1CCF9 ; Numeric # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE +1D7CE..1D7FF ; Numeric # Nd [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE +1E140..1E149 ; Numeric # Nd [10] NYIAKENG PUACHUE HMONG DIGIT ZERO..NYIAKENG PUACHUE HMONG DIGIT NINE +1E2F0..1E2F9 ; Numeric # Nd [10] WANCHO DIGIT ZERO..WANCHO DIGIT NINE +1E4F0..1E4F9 ; Numeric # Nd [10] NAG MUNDARI DIGIT ZERO..NAG MUNDARI DIGIT NINE +1E5F1..1E5FA ; Numeric # Nd [10] OL ONAL DIGIT ZERO..OL ONAL DIGIT NINE +1E950..1E959 ; Numeric # Nd [10] ADLAM DIGIT ZERO..ADLAM DIGIT NINE +1FBF0..1FBF9 ; Numeric # Nd [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE + +# Total code points: 784 + +# ================================================ + +005F ; ExtendNumLet # Pc LOW LINE +202F ; ExtendNumLet # Zs NARROW NO-BREAK SPACE +203F..2040 ; ExtendNumLet # Pc [2] UNDERTIE..CHARACTER TIE +2054 ; ExtendNumLet # Pc INVERTED UNDERTIE +FE33..FE34 ; ExtendNumLet # Pc [2] PRESENTATION FORM FOR VERTICAL LOW LINE..PRESENTATION FORM FOR VERTICAL WAVY LOW LINE +FE4D..FE4F ; ExtendNumLet # Pc [3] DASHED LOW LINE..WAVY LOW LINE +FF3F ; ExtendNumLet # Pc FULLWIDTH LOW LINE + +# Total code points: 11 + +# ================================================ + +200D ; ZWJ # Cf ZERO WIDTH JOINER + +# Total code points: 1 + +# ================================================ + +0020 ; WSegSpace # Zs SPACE +1680 ; WSegSpace # Zs OGHAM SPACE MARK +2000..2006 ; WSegSpace # Zs [7] EN QUAD..SIX-PER-EM SPACE +2008..200A ; WSegSpace # Zs [3] PUNCTUATION SPACE..HAIR SPACE +205F ; WSegSpace # Zs MEDIUM MATHEMATICAL SPACE +3000 ; WSegSpace # Zs IDEOGRAPHIC SPACE + +# Total code points: 14 + +# EOF diff --git 
a/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt b/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt new file mode 100644 index 000000000..d52b6278a --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt @@ -0,0 +1,9994 @@ +# confusables.txt +# Date: 2025-07-22, 05:49:37 GMT +# © 2025 Unicode®, Inc. +# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. +# For terms of use and license, see https://www.unicode.org/terms_of_use.html +# +# Unicode Security Mechanisms for UTS #39 +# Version: 17.0.0 +# +# For documentation and usage, see https://www.unicode.org/reports/tr39 +# +05AD ; 0596 ; MA # ( ֭ → ֖ ) HEBREW ACCENT DEHI → HEBREW ACCENT TIPEHA # + +05AE ; 0598 ; MA # ( ֮ → ֘ ) HEBREW ACCENT ZINOR → HEBREW ACCENT ZARQA # + +05A8 ; 0599 ; MA # ( ֨ → ֙ ) HEBREW ACCENT QADMA → HEBREW ACCENT PASHTA # + +05A4 ; 059A ; MA # ( ֤ → ֚ ) HEBREW ACCENT MAHAPAKH → HEBREW ACCENT YETIV # + +1AB4 ; 06DB ; MA # ( ᪴ → ۛ ) COMBINING TRIPLE DOT → ARABIC SMALL HIGH THREE DOTS # +20DB ; 06DB ; MA # ( ⃛ → ۛ ) COMBINING THREE DOTS ABOVE → ARABIC SMALL HIGH THREE DOTS # →᪴→ + +0619 ; 0313 ; MA # ( ؙ → ̓ ) ARABIC SMALL DAMMA → COMBINING COMMA ABOVE # →ُ→ +08F3 ; 0313 ; MA # ( ࣳ → ̓ ) ARABIC SMALL HIGH WAW → COMBINING COMMA ABOVE # →ُ→ +0343 ; 0313 ; MA # ( ̓ → ̓ ) COMBINING GREEK KORONIS → COMBINING COMMA ABOVE # +0315 ; 0313 ; MA # ( ̕ → ̓ ) COMBINING COMMA ABOVE RIGHT → COMBINING COMMA ABOVE # +064F ; 0313 ; MA # ( ُ → ̓ ) ARABIC DAMMA → COMBINING COMMA ABOVE # + +065D ; 0314 ; MA # ( ٝ → ̔ ) ARABIC REVERSED DAMMA → COMBINING REVERSED COMMA ABOVE # + +059C ; 0301 ; MA # ( ֜ → ́ ) HEBREW ACCENT GERESH → COMBINING ACUTE ACCENT # +059D ; 0301 ; MA # ( ֝ → ́ ) HEBREW ACCENT GERESH MUQDAM → COMBINING ACUTE ACCENT # →֜→ +0618 ; 0301 ; MA # ( ؘ → ́ ) ARABIC SMALL FATHA → COMBINING ACUTE ACCENT # 
→َ→ +0747 ; 0301 ; MA # ( ݇ → ́ ) SYRIAC OBLIQUE LINE ABOVE → COMBINING ACUTE ACCENT # +0341 ; 0301 ; MA # ( ́ → ́ ) COMBINING ACUTE TONE MARK → COMBINING ACUTE ACCENT # +0954 ; 0301 ; MA # ( ॔ → ́ ) DEVANAGARI ACUTE ACCENT → COMBINING ACUTE ACCENT # +064E ; 0301 ; MA # ( َ → ́ ) ARABIC FATHA → COMBINING ACUTE ACCENT # + +0340 ; 0300 ; MA # ( ̀ → ̀ ) COMBINING GRAVE TONE MARK → COMBINING GRAVE ACCENT # +0953 ; 0300 ; MA # ( ॓ → ̀ ) DEVANAGARI GRAVE ACCENT → COMBINING GRAVE ACCENT # + +030C ; 0306 ; MA # ( ̌ → ̆ ) COMBINING CARON → COMBINING BREVE # +A67C ; 0306 ; MA # ( ꙼ → ̆ ) COMBINING CYRILLIC KAVYKA → COMBINING BREVE # +0658 ; 0306 ; MA # ( ٘ → ̆ ) ARABIC MARK NOON GHUNNA → COMBINING BREVE # +065A ; 0306 ; MA # ( ٚ → ̆ ) ARABIC VOWEL SIGN SMALL V ABOVE → COMBINING BREVE # →̌→ +036E ; 0306 ; MA # ( ͮ → ̆ ) COMBINING LATIN SMALL LETTER V → COMBINING BREVE # →̌→ +0945 ; 0306 ; MA # ( ॅ → ̆ ) DEVANAGARI VOWEL SIGN CANDRA E → COMBINING BREVE # +11B66 ; 0306 ; MA # ( 𑭦 → ̆ ) SHARADA VOWEL SIGN CANDRA E → COMBINING BREVE # →ॅ→ + +06E8 ; 0306 0307 ; MA # ( ۨ → ̆̇ ) ARABIC SMALL HIGH NOON → COMBINING BREVE, COMBINING DOT ABOVE # →̐→ +0310 ; 0306 0307 ; MA # ( ̐ → ̆̇ ) COMBINING CANDRABINDU → COMBINING BREVE, COMBINING DOT ABOVE # +0901 ; 0306 0307 ; MA # ( ँ → ̆̇ ) DEVANAGARI SIGN CANDRABINDU → COMBINING BREVE, COMBINING DOT ABOVE # →̐→ +0981 ; 0306 0307 ; MA # ( ঁ → ̆̇ ) BENGALI SIGN CANDRABINDU → COMBINING BREVE, COMBINING DOT ABOVE # →̐→ +0A81 ; 0306 0307 ; MA # ( ઁ → ̆̇ ) GUJARATI SIGN CANDRABINDU → COMBINING BREVE, COMBINING DOT ABOVE # →̐→ +0B01 ; 0306 0307 ; MA # ( ଁ → ̆̇ ) ORIYA SIGN CANDRABINDU → COMBINING BREVE, COMBINING DOT ABOVE # →̐→ +0C00 ; 0306 0307 ; MA # ( ఀ → ̆̇ ) TELUGU SIGN COMBINING CANDRABINDU ABOVE → COMBINING BREVE, COMBINING DOT ABOVE # →ँ→→̐→ +0C81 ; 0306 0307 ; MA # ( ಁ → ̆̇ ) KANNADA SIGN CANDRABINDU → COMBINING BREVE, COMBINING DOT ABOVE # →ँ→→̐→ +0D01 ; 0306 0307 ; MA # ( ഁ → ̆̇ ) MALAYALAM SIGN CANDRABINDU → COMBINING BREVE, COMBINING DOT 
ABOVE # →ँ→→̐→ +114BF ; 0306 0307 ; MA # ( 𑒿 → ̆̇ ) TIRHUTA SIGN CANDRABINDU → COMBINING BREVE, COMBINING DOT ABOVE # →ঁ→→̐→ + +1CD0 ; 0302 ; MA # ( ᳐ → ̂ ) VEDIC TONE KARSHANA → COMBINING CIRCUMFLEX ACCENT # +0311 ; 0302 ; MA # ( ̑ → ̂ ) COMBINING INVERTED BREVE → COMBINING CIRCUMFLEX ACCENT # +065B ; 0302 ; MA # ( ٛ → ̂ ) ARABIC VOWEL SIGN INVERTED SMALL V ABOVE → COMBINING CIRCUMFLEX ACCENT # +07EE ; 0302 ; MA # ( ߮ → ̂ ) NKO COMBINING LONG DESCENDING TONE → COMBINING CIRCUMFLEX ACCENT # +A6F0 ; 0302 ; MA # ( ꛰ → ̂ ) BAMUM COMBINING MARK KOQNDON → COMBINING CIRCUMFLEX ACCENT # + +05AF ; 030A ; MA # ( ֯ → ̊ ) HEBREW MARK MASORA CIRCLE → COMBINING RING ABOVE # +06DF ; 030A ; MA # ( ۟ → ̊ ) ARABIC SMALL HIGH ROUNDED ZERO → COMBINING RING ABOVE # →ْ→ +17D3 ; 030A ; MA # ( ៓ → ̊ ) KHMER SIGN BATHAMASAT → COMBINING RING ABOVE # +309A ; 030A ; MA # ( ゚ → ̊ ) COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK → COMBINING RING ABOVE # +0652 ; 030A ; MA # ( ْ → ̊ ) ARABIC SUKUN → COMBINING RING ABOVE # +0B82 ; 030A ; MA # ( ஂ → ̊ ) TAMIL SIGN ANUSVARA → COMBINING RING ABOVE # +1036 ; 030A ; MA # ( ံ → ̊ ) MYANMAR SIGN ANUSVARA → COMBINING RING ABOVE # +17C6 ; 030A ; MA # ( ំ → ̊ ) KHMER SIGN NIKAHIT → COMBINING RING ABOVE # +11300 ; 030A ; MA # ( 𑌀 → ̊ ) GRANTHA SIGN COMBINING ANUSVARA ABOVE → COMBINING RING ABOVE # →ஂ→ +0E4D ; 030A ; MA # ( ํ → ̊ ) THAI CHARACTER NIKHAHIT → COMBINING RING ABOVE # +0ECD ; 030A ; MA # ( ໍ → ̊ ) LAO NIGGAHITA → COMBINING RING ABOVE # +0366 ; 030A ; MA # ( ͦ → ̊ ) COMBINING LATIN SMALL LETTER O → COMBINING RING ABOVE # +2DEA ; 030A ; MA # ( ⷪ → ̊ ) COMBINING CYRILLIC LETTER O → COMBINING RING ABOVE # →ͦ→ + +08EB ; 0308 ; MA # ( ࣫ → ̈ ) ARABIC TONE TWO DOTS ABOVE → COMBINING DIAERESIS # +07F3 ; 0308 ; MA # ( ߳ → ̈ ) NKO COMBINING DOUBLE DOT ABOVE → COMBINING DIAERESIS # + +064B ; 030B ; MA # ( ً → ̋ ) ARABIC FATHATAN → COMBINING DOUBLE ACUTE ACCENT # +08F0 ; 030B ; MA # ( ࣰ → ̋ ) ARABIC OPEN FATHATAN → COMBINING DOUBLE ACUTE ACCENT # →ً→ + 
+0342 ; 0303 ; MA # ( ͂ → ̃ ) COMBINING GREEK PERISPOMENI → COMBINING TILDE # +0653 ; 0303 ; MA # ( ٓ → ̃ ) ARABIC MADDAH ABOVE → COMBINING TILDE # + +05C4 ; 0307 ; MA # ( ׄ → ̇ ) HEBREW MARK UPPER DOT → COMBINING DOT ABOVE # +06EC ; 0307 ; MA # ( ۬ → ̇ ) ARABIC ROUNDED HIGH STOP WITH FILLED CENTRE → COMBINING DOT ABOVE # +0740 ; 0307 ; MA # ( ݀ → ̇ ) SYRIAC FEMININE DOT → COMBINING DOT ABOVE # →݁→ +08EA ; 0307 ; MA # ( ࣪ → ̇ ) ARABIC TONE ONE DOT ABOVE → COMBINING DOT ABOVE # +0741 ; 0307 ; MA # ( ݁ → ̇ ) SYRIAC QUSHSHAYA → COMBINING DOT ABOVE # +0358 ; 0307 ; MA # ( ͘ → ̇ ) COMBINING DOT ABOVE RIGHT → COMBINING DOT ABOVE # +05B9 ; 0307 ; MA # ( ֹ → ̇ ) HEBREW POINT HOLAM → COMBINING DOT ABOVE # +05BA ; 0307 ; MA # ( ֺ → ̇ ) HEBREW POINT HOLAM HASER FOR VAV → COMBINING DOT ABOVE # →ׁ→ +05C2 ; 0307 ; MA # ( ׂ → ̇ ) HEBREW POINT SIN DOT → COMBINING DOT ABOVE # +05C1 ; 0307 ; MA # ( ׁ → ̇ ) HEBREW POINT SHIN DOT → COMBINING DOT ABOVE # +07ED ; 0307 ; MA # ( ߭ → ̇ ) NKO COMBINING SHORT RISING TONE → COMBINING DOT ABOVE # +0902 ; 0307 ; MA # ( ं → ̇ ) DEVANAGARI SIGN ANUSVARA → COMBINING DOT ABOVE # +0A02 ; 0307 ; MA # ( ਂ → ̇ ) GURMUKHI SIGN BINDI → COMBINING DOT ABOVE # +0A82 ; 0307 ; MA # ( ં → ̇ ) GUJARATI SIGN ANUSVARA → COMBINING DOT ABOVE # +0BCD ; 0307 ; MA # ( ் → ̇ ) TAMIL SIGN VIRAMA → COMBINING DOT ABOVE # + +0337 ; 0338 ; MA # ( ̷ → ̸ ) COMBINING SHORT SOLIDUS OVERLAY → COMBINING LONG SOLIDUS OVERLAY # + +1AB7 ; 0328 ; MA # ( ᪷ → ̨ ) COMBINING OPEN MARK BELOW → COMBINING OGONEK # +0322 ; 0328 ; MA # ( ̢ → ̨ ) COMBINING RETROFLEX HOOK BELOW → COMBINING OGONEK # +0345 ; 0328 ; MA # ( ͅ → ̨ ) COMBINING GREEK YPOGEGRAMMENI → COMBINING OGONEK # + +1CD2 ; 0304 ; MA # ( ᳒ → ̄ ) VEDIC TONE PRENKHA → COMBINING MACRON # +0305 ; 0304 ; MA # ( ̅ → ̄ ) COMBINING OVERLINE → COMBINING MACRON # +0659 ; 0304 ; MA # ( ٙ → ̄ ) ARABIC ZWARAKAY → COMBINING MACRON # +07EB ; 0304 ; MA # ( ߫ → ̄ ) NKO COMBINING SHORT HIGH TONE → COMBINING MACRON # +A6F1 ; 0304 ; MA # ( ꛱ → ̄ ) 
BAMUM COMBINING MARK TUKWENTIS → COMBINING MACRON # +1AE2 ; 0304 ; MA # ( ᫢ → ̄ ) COMBINING MINUS SIGN ABOVE → COMBINING MACRON # + +1AE8 ; 0304 0304 ; MA # ( ᫨ → ̄̄ ) COMBINING EQUALS SIGN ABOVE → COMBINING MACRON, COMBINING MACRON # + +1CDA ; 030E ; MA # ( ᳚ → ̎ ) VEDIC TONE DOUBLE SVARITA → COMBINING DOUBLE VERTICAL LINE ABOVE # + +0657 ; 0312 ; MA # ( ٗ → ̒ ) ARABIC INVERTED DAMMA → COMBINING TURNED COMMA ABOVE # + +0357 ; 0350 ; MA # ( ͗ → ͐ ) COMBINING RIGHT HALF RING ABOVE → COMBINING RIGHT ARROWHEAD ABOVE # →ࣿ→→ࣸ→ +08FF ; 0350 ; MA # ( ࣿ → ͐ ) ARABIC MARK SIDEWAYS NOON GHUNNA → COMBINING RIGHT ARROWHEAD ABOVE # →ࣸ→ +08F8 ; 0350 ; MA # ( ࣸ → ͐ ) ARABIC RIGHT ARROWHEAD ABOVE → COMBINING RIGHT ARROWHEAD ABOVE # + +0900 ; 0352 ; MA # ( ऀ → ͒ ) DEVANAGARI SIGN INVERTED CANDRABINDU → COMBINING FERMATA # + +1AD9 ; 1AC6 ; MA # ( ᫙ → ᫆ ) COMBINING SHARP SIGN → COMBINING NUMBER SIGN ABOVE # + +1E6EE ; 1AC8 ; MA # ( 𞛮 → ᫈ ) TAI YO SIGN AY → COMBINING PLUS SIGN ABOVE # + +1CED ; 0316 ; MA # ( ᳭ → ̖ ) VEDIC SIGN TIRYAK → COMBINING GRAVE ACCENT BELOW # + +1CDC ; 0329 ; MA # ( ᳜ → ̩ ) VEDIC TONE KATHAKA ANUDATTA → COMBINING VERTICAL LINE BELOW # +0656 ; 0329 ; MA # ( ٖ → ̩ ) ARABIC SUBSCRIPT ALEF → COMBINING VERTICAL LINE BELOW # + +1CD5 ; 032B ; MA # ( ᳕ → ̫ ) VEDIC TONE YAJURVEDIC AGGRAVATED INDEPENDENT SVARITA → COMBINING INVERTED DOUBLE ARCH BELOW # + +0347 ; 0333 ; MA # ( ͇ → ̳ ) COMBINING EQUALS SIGN BELOW → COMBINING DOUBLE LOW LINE # + +08F9 ; 0354 ; MA # ( ࣹ → ͔ ) ARABIC LEFT ARROWHEAD BELOW → COMBINING LEFT ARROWHEAD BELOW # + +08FA ; 0355 ; MA # ( ࣺ → ͕ ) ARABIC RIGHT ARROWHEAD BELOW → COMBINING RIGHT ARROWHEAD BELOW # + +309B ; FF9E ; MA #* ( ゛ → ゙ ) KATAKANA-HIRAGANA VOICED SOUND MARK → HALFWIDTH KATAKANA VOICED SOUND MARK # + +309C ; FF9F ; MA #* ( ゜ → ゚ ) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK → HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK # + +0336 ; 0335 ; MA # ( ̶ → ̵ ) COMBINING LONG STROKE OVERLAY → COMBINING SHORT STROKE OVERLAY # + +302C ; 0309 ; MA # 
( 〬 → ̉ ) IDEOGRAPHIC DEPARTING TONE MARK → COMBINING HOOK ABOVE # + +05C5 ; 0323 ; MA # ( ׅ → ̣ ) HEBREW MARK LOWER DOT → COMBINING DOT BELOW # +08ED ; 0323 ; MA # ( ࣭ → ̣ ) ARABIC TONE ONE DOT BELOW → COMBINING DOT BELOW # +1CDD ; 0323 ; MA # ( ᳝ → ̣ ) VEDIC TONE DOT BELOW → COMBINING DOT BELOW # +05B4 ; 0323 ; MA # ( ִ → ̣ ) HEBREW POINT HIRIQ → COMBINING DOT BELOW # +065C ; 0323 ; MA # ( ٜ → ̣ ) ARABIC VOWEL SIGN DOT BELOW → COMBINING DOT BELOW # +093C ; 0323 ; MA # ( ़ → ̣ ) DEVANAGARI SIGN NUKTA → COMBINING DOT BELOW # +09BC ; 0323 ; MA # ( ় → ̣ ) BENGALI SIGN NUKTA → COMBINING DOT BELOW # +0A3C ; 0323 ; MA # ( ਼ → ̣ ) GURMUKHI SIGN NUKTA → COMBINING DOT BELOW # +0ABC ; 0323 ; MA # ( ઼ → ̣ ) GUJARATI SIGN NUKTA → COMBINING DOT BELOW # +0B3C ; 0323 ; MA # ( ଼ → ̣ ) ORIYA SIGN NUKTA → COMBINING DOT BELOW # +111CA ; 0323 ; MA # ( 𑇊 → ̣ ) SHARADA SIGN NUKTA → COMBINING DOT BELOW # →़→ +114C3 ; 0323 ; MA # ( 𑓃 → ̣ ) TIRHUTA SIGN NUKTA → COMBINING DOT BELOW # →়→ +10A3A ; 0323 ; MA # ( 𐨺 → ̣ ) KHAROSHTHI SIGN DOT BELOW → COMBINING DOT BELOW # + +08EE ; 0324 ; MA # ( ࣮ → ̤ ) ARABIC TONE TWO DOTS BELOW → COMBINING DIAERESIS BELOW # +1CDE ; 0324 ; MA # ( ᳞ → ̤ ) VEDIC TONE TWO DOTS BELOW → COMBINING DIAERESIS BELOW # + +0F37 ; 0325 ; MA # ( ༷ → ̥ ) TIBETAN MARK NGAS BZUNG SGOR RTAGS → COMBINING RING BELOW # +302D ; 0325 ; MA # ( 〭 → ̥ ) IDEOGRAPHIC ENTERING TONE MARK → COMBINING RING BELOW # + +0327 ; 0326 ; MA # ( ̧ → ̦ ) COMBINING CEDILLA → COMBINING COMMA BELOW # →̡→ +0321 ; 0326 ; MA # ( ̡ → ̦ ) COMBINING PALATALIZED HOOK BELOW → COMBINING COMMA BELOW # +0339 ; 0326 ; MA # ( ̹ → ̦ ) COMBINING RIGHT HALF RING BELOW → COMBINING COMMA BELOW # →̧→→̡→ + +1CD9 ; 032D ; MA # ( ᳙ → ̭ ) VEDIC TONE YAJURVEDIC KATHAKA INDEPENDENT SVARITA SCHROEDER → COMBINING CIRCUMFLEX ACCENT BELOW # + +1CD8 ; 032E ; MA # ( ᳘ → ̮ ) VEDIC TONE CANDRA BELOW → COMBINING BREVE BELOW # + +0952 ; 0331 ; MA # ( ॒ → ̱ ) DEVANAGARI STRESS SIGN ANUDATTA → COMBINING MACRON BELOW # +0320 ; 0331 ; MA # 
( ̠ → ̱ ) COMBINING MINUS SIGN BELOW → COMBINING MACRON BELOW # + +08F1 ; 064C ; MA # ( ࣱ → ٌ ) ARABIC OPEN DAMMATAN → ARABIC DAMMATAN # +08E8 ; 064C ; MA # ( ࣨ → ٌ ) ARABIC CURLY DAMMATAN → ARABIC DAMMATAN # +08E5 ; 064C ; MA # ( ࣥ → ٌ ) ARABIC CURLY DAMMA → ARABIC DAMMATAN # + +FC5E ; FE72 0651 ; MA #* ( ‎ﱞ‎ → ‎ﹲّ‎ ) ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM → ARABIC DAMMATAN ISOLATED FORM, ARABIC SHADDA # + +08F2 ; 064D ; MA # ( ࣲ → ٍ ) ARABIC OPEN KASRATAN → ARABIC KASRATAN # + +FC5F ; FE74 0651 ; MA #* ( ‎ﱟ‎ → ‎ﹴّ‎ ) ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM → ARABIC KASRATAN ISOLATED FORM, ARABIC SHADDA # + +FCF2 ; FE77 0651 ; MA # ( ‎ﳲ‎ → ‎ﹷّ‎ ) ARABIC LIGATURE SHADDA WITH FATHA MEDIAL FORM → ARABIC FATHA MEDIAL FORM, ARABIC SHADDA # + +FC60 ; FE76 0651 ; MA #* ( ‎ﱠ‎ → ‎ﹶّ‎ ) ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM → ARABIC FATHA ISOLATED FORM, ARABIC SHADDA # + +FCF3 ; FE79 0651 ; MA # ( ‎ﳳ‎ → ‎ﹹّ‎ ) ARABIC LIGATURE SHADDA WITH DAMMA MEDIAL FORM → ARABIC DAMMA MEDIAL FORM, ARABIC SHADDA # + +FC61 ; FE78 0651 ; MA #* ( ‎ﱡ‎ → ‎ﹸّ‎ ) ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM → ARABIC DAMMA ISOLATED FORM, ARABIC SHADDA # + +061A ; 0650 ; MA # ( ؚ → ِ ) ARABIC SMALL KASRA → ARABIC KASRA # +0317 ; 0650 ; MA # ( ̗ → ِ ) COMBINING ACUTE ACCENT BELOW → ARABIC KASRA # + +FCF4 ; FE7B 0651 ; MA # ( ‎ﳴ‎ → ‎ﹻّ‎ ) ARABIC LIGATURE SHADDA WITH KASRA MEDIAL FORM → ARABIC KASRA MEDIAL FORM, ARABIC SHADDA # + +FC62 ; FE7A 0651 ; MA #* ( ‎ﱢ‎ → ‎ﹺّ‎ ) ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM → ARABIC KASRA ISOLATED FORM, ARABIC SHADDA # + +FC63 ; FE7C 0670 ; MA #* ( ‎ﱣ‎ → ‎ﹼٰ‎ ) ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM → ARABIC SHADDA ISOLATED FORM, ARABIC LETTER SUPERSCRIPT ALEF # + +065F ; 0655 ; MA # ( ٟ → ٕ ) ARABIC WAVY HAMZA BELOW → ARABIC HAMZA BELOW # + +030D ; 0670 ; MA # ( ̍ → ٰ ) COMBINING VERTICAL LINE ABOVE → ARABIC LETTER SUPERSCRIPT ALEF # + +0742 ; 073C ; MA # ( ݂ → ܼ ) SYRIAC RUKKAKHA → SYRIAC HBASA-ESASA 
DOTTED # + +0A03 ; 0983 ; MA # ( ਃ → ঃ ) GURMUKHI SIGN VISARGA → BENGALI SIGN VISARGA # +0C03 ; 0983 ; MA # ( ః → ঃ ) TELUGU SIGN VISARGA → BENGALI SIGN VISARGA # →ਃ→ +0C83 ; 0983 ; MA # ( ಃ → ঃ ) KANNADA SIGN VISARGA → BENGALI SIGN VISARGA # →ః→→ਃ→ +0D03 ; 0983 ; MA # ( ഃ → ঃ ) MALAYALAM SIGN VISARGA → BENGALI SIGN VISARGA # →ಃ→→ః→→ਃ→ +0D83 ; 0983 ; MA # ( ඃ → ঃ ) SINHALA SIGN VISARGAYA → BENGALI SIGN VISARGA # →ഃ→→ಃ→→ః→→ਃ→ +1038 ; 0983 ; MA # ( း → ঃ ) MYANMAR SIGN VISARGA → BENGALI SIGN VISARGA # →ඃ→→ഃ→→ಃ→→ః→→ਃ→ +114C1 ; 0983 ; MA # ( 𑓁 → ঃ ) TIRHUTA SIGN VISARGA → BENGALI SIGN VISARGA # + +17CB ; 0E48 ; MA # ( ់ → ่ ) KHMER SIGN BANTOC → THAI CHARACTER MAI EK # +0EC8 ; 0E48 ; MA # ( ່ → ่ ) LAO TONE MAI EK → THAI CHARACTER MAI EK # + +0EC9 ; 0E49 ; MA # ( ້ → ้ ) LAO TONE MAI THO → THAI CHARACTER MAI THO # + +0ECA ; 0E4A ; MA # ( ໊ → ๊ ) LAO TONE MAI TI → THAI CHARACTER MAI TRI # + +0ECB ; 0E4B ; MA # ( ໋ → ๋ ) LAO TONE MAI CATAWA → THAI CHARACTER MAI CHATTAWA # + +A66F ; 20E9 ; MA # ( ꙯ → ⃩ ) COMBINING CYRILLIC VZMET → COMBINING WIDE BRIDGE ABOVE # + +2028 ; 0020 ; MA #* ( → ) LINE SEPARATOR → SPACE # +2029 ; 0020 ; MA #* ( → ) PARAGRAPH SEPARATOR → SPACE # +1680 ; 0020 ; MA #* (   → ) OGHAM SPACE MARK → SPACE # +2000 ; 0020 ; MA #* (   → ) EN QUAD → SPACE # +2001 ; 0020 ; MA #* (   → ) EM QUAD → SPACE # +2002 ; 0020 ; MA #* (   → ) EN SPACE → SPACE # +2003 ; 0020 ; MA #* (   → ) EM SPACE → SPACE # +2004 ; 0020 ; MA #* (   → ) THREE-PER-EM SPACE → SPACE # +2005 ; 0020 ; MA #* (   → ) FOUR-PER-EM SPACE → SPACE # +2006 ; 0020 ; MA #* (   → ) SIX-PER-EM SPACE → SPACE # +2008 ; 0020 ; MA #* (   → ) PUNCTUATION SPACE → SPACE # +2009 ; 0020 ; MA #* (   → ) THIN SPACE → SPACE # +200A ; 0020 ; MA #* (   → ) HAIR SPACE → SPACE # +205F ; 0020 ; MA #* (   → ) MEDIUM MATHEMATICAL SPACE → SPACE # +00A0 ; 0020 ; MA #* (   → ) NO-BREAK SPACE → SPACE # +2007 ; 0020 ; MA #* (   → ) FIGURE SPACE → SPACE # +202F ; 0020 ; MA #* (   → ) NARROW NO-BREAK SPACE → SPACE # + +07FA ; 
005F ; MA # ( ‎ߺ‎ → _ ) NKO LAJANYALAN → LOW LINE # +FE4D ; 005F ; MA # ( ﹍ → _ ) DASHED LOW LINE → LOW LINE # +FE4E ; 005F ; MA # ( ﹎ → _ ) CENTRELINE LOW LINE → LOW LINE # +FE4F ; 005F ; MA # ( ﹏ → _ ) WAVY LOW LINE → LOW LINE # + +2010 ; 002D ; MA #* ( ‐ → - ) HYPHEN → HYPHEN-MINUS # +2011 ; 002D ; MA #* ( ‑ → - ) NON-BREAKING HYPHEN → HYPHEN-MINUS # +2012 ; 002D ; MA #* ( ‒ → - ) FIGURE DASH → HYPHEN-MINUS # +2013 ; 002D ; MA #* ( – → - ) EN DASH → HYPHEN-MINUS # +FE58 ; 002D ; MA #* ( ﹘ → - ) SMALL EM DASH → HYPHEN-MINUS # +06D4 ; 002D ; MA #* ( ‎۔‎ → - ) ARABIC FULL STOP → HYPHEN-MINUS # →‐→ +2043 ; 002D ; MA #* ( ⁃ → - ) HYPHEN BULLET → HYPHEN-MINUS # →‐→ +02D7 ; 002D ; MA #* ( ˗ → - ) MODIFIER LETTER MINUS SIGN → HYPHEN-MINUS # +2212 ; 002D ; MA #* ( − → - ) MINUS SIGN → HYPHEN-MINUS # +2796 ; 002D ; MA #* ( ➖ → - ) HEAVY MINUS SIGN → HYPHEN-MINUS # →−→ +2CBB ; 002D ; MA # ( ⲻ → - ) COPTIC SMALL LETTER DIALECT-P NI → HYPHEN-MINUS # →−→ +2CBA ; 002D ; MA # ( Ⲻ → - ) COPTIC CAPITAL LETTER DIALECT-P NI → HYPHEN-MINUS # →‒→ + +2A29 ; 002D 0313 ; MA #* ( ⨩ → -̓ ) MINUS SIGN WITH COMMA ABOVE → HYPHEN-MINUS, COMBINING COMMA ABOVE # →−̓→ + +2E1A ; 002D 0308 ; MA #* ( ⸚ → -̈ ) HYPHEN WITH DIAERESIS → HYPHEN-MINUS, COMBINING DIAERESIS # + +FB29 ; 002D 0307 ; MA #* ( ﬩ → -̇ ) HEBREW LETTER ALTERNATIVE PLUS SIGN → HYPHEN-MINUS, COMBINING DOT ABOVE # →∸→→−̇→ +2238 ; 002D 0307 ; MA #* ( ∸ → -̇ ) DOT MINUS → HYPHEN-MINUS, COMBINING DOT ABOVE # →−̇→ +2CB3 ; 002D 0307 ; MA # ( ⲳ → -̇ ) COPTIC SMALL LETTER DIALECT-P ALEF → HYPHEN-MINUS, COMBINING DOT ABOVE # →﬩→→∸→→−̇→ +2CB2 ; 002D 0307 ; MA # ( Ⲳ → -̇ ) COPTIC CAPITAL LETTER DIALECT-P ALEF → HYPHEN-MINUS, COMBINING DOT ABOVE # →﬩→→∸→→−̇→ + +2A2A ; 002D 0323 ; MA #* ( ⨪ → -̣ ) MINUS SIGN WITH DOT BELOW → HYPHEN-MINUS, COMBINING DOT BELOW # →−̣→ + +A4FE ; 002D 002E ; MA #* ( ꓾ → -. 
) LISU PUNCTUATION COMMA → HYPHEN-MINUS, FULL STOP # + +FF5E ; 301C ; MA #* ( ~ → 〜 ) FULLWIDTH TILDE → WAVE DASH # + +060D ; 002C ; MA #* ( ‎؍‎ → , ) ARABIC DATE SEPARATOR → COMMA # →‎٫‎→ +066B ; 002C ; MA #* ( ‎٫‎ → , ) ARABIC DECIMAL SEPARATOR → COMMA # +201A ; 002C ; MA #* ( ‚ → , ) SINGLE LOW-9 QUOTATION MARK → COMMA # +00B8 ; 002C ; MA #* ( ¸ → , ) CEDILLA → COMMA # +A4F9 ; 002C ; MA # ( ꓹ → , ) LISU LETTER TONE NA PO → COMMA # + +2E32 ; 060C ; MA #* ( ⸲ → ، ) TURNED COMMA → ARABIC COMMA # +066C ; 060C ; MA #* ( ‎٬‎ → ، ) ARABIC THOUSANDS SEPARATOR → ARABIC COMMA # + +037E ; 003B ; MA #* ( ; → ; ) GREEK QUESTION MARK → SEMICOLON # + +2E35 ; 061B ; MA #* ( ⸵ → ‎؛‎ ) TURNED SEMICOLON → ARABIC SEMICOLON # + +0903 ; 003A ; MA # ( ः → : ) DEVANAGARI SIGN VISARGA → COLON # +0A83 ; 003A ; MA # ( ઃ → : ) GUJARATI SIGN VISARGA → COLON # +FF1A ; 003A ; MA #* ( : → : ) FULLWIDTH COLON → COLON # →︰→ +0589 ; 003A ; MA #* ( ։ → : ) ARMENIAN FULL STOP → COLON # +0703 ; 003A ; MA #* ( ‎܃‎ → : ) SYRIAC SUPRALINEAR COLON → COLON # +0704 ; 003A ; MA #* ( ‎܄‎ → : ) SYRIAC SUBLINEAR COLON → COLON # +16EC ; 003A ; MA #* ( ᛬ → : ) RUNIC MULTIPLE PUNCTUATION → COLON # +FE30 ; 003A ; MA #* ( ︰ → : ) PRESENTATION FORM FOR VERTICAL TWO DOT LEADER → COLON # +1803 ; 003A ; MA #* ( ᠃ → : ) MONGOLIAN FULL STOP → COLON # +1809 ; 003A ; MA #* ( ᠉ → : ) MONGOLIAN MANCHU FULL STOP → COLON # +205A ; 003A ; MA #* ( ⁚ → : ) TWO DOT PUNCTUATION → COLON # +05C3 ; 003A ; MA #* ( ‎׃‎ → : ) HEBREW PUNCTUATION SOF PASUQ → COLON # +02F8 ; 003A ; MA #* ( ˸ → : ) MODIFIER LETTER RAISED COLON → COLON # +A789 ; 003A ; MA #* ( ꞉ → : ) MODIFIER LETTER COLON → COLON # +2236 ; 003A ; MA #* ( ∶ → : ) RATIO → COLON # +02D0 ; 003A ; MA # ( ː → : ) MODIFIER LETTER TRIANGULAR COLON → COLON # +A4FD ; 003A ; MA # ( ꓽ → : ) LISU LETTER TONE MYA JEU → COLON # +11DD9 ; 003A ; MA # ( 𑷙 → : ) TOLONG SIKI SIGN SELA → COLON # + +2A74 ; 003A 003A 003D ; MA #* ( ⩴ → ::= ) DOUBLE COLON EQUAL → COLON, COLON, EQUALS SIGN # + 
+29F4 ; 003A 2192 ; MA #* ( ⧴ → :→ ) RULE-DELAYED → COLON, RIGHTWARDS ARROW # + +FF01 ; 0021 ; MA #* ( ! → ! ) FULLWIDTH EXCLAMATION MARK → EXCLAMATION MARK # →ǃ→ +01C3 ; 0021 ; MA # ( ǃ → ! ) LATIN LETTER RETROFLEX CLICK → EXCLAMATION MARK # +2D51 ; 0021 ; MA # ( ⵑ → ! ) TIFINAGH LETTER TUAREG YANG → EXCLAMATION MARK # + +203C ; 0021 0021 ; MA #* ( ‼ → !! ) DOUBLE EXCLAMATION MARK → EXCLAMATION MARK, EXCLAMATION MARK # + +2049 ; 0021 003F ; MA #* ( ⁉ → !? ) EXCLAMATION QUESTION MARK → EXCLAMATION MARK, QUESTION MARK # + +0294 ; 003F ; MA # ( ʔ → ? ) LATIN LETTER GLOTTAL STOP → QUESTION MARK # +0241 ; 003F ; MA # ( Ɂ → ? ) LATIN CAPITAL LETTER GLOTTAL STOP → QUESTION MARK # →ʔ→ +097D ; 003F ; MA # ( ॽ → ? ) DEVANAGARI LETTER GLOTTAL STOP → QUESTION MARK # +13AE ; 003F ; MA # ( Ꭾ → ? ) CHEROKEE LETTER HE → QUESTION MARK # →Ɂ→→ʔ→ +A6EB ; 003F ; MA # ( ꛫ → ? ) BAMUM LETTER NTUU → QUESTION MARK # →ʔ→ + +2048 ; 003F 0021 ; MA #* ( ⁈ → ?! ) QUESTION EXCLAMATION MARK → QUESTION MARK, EXCLAMATION MARK # + +2047 ; 003F 003F ; MA #* ( ⁇ → ?? ) DOUBLE QUESTION MARK → QUESTION MARK, QUESTION MARK # + +2E2E ; 061F ; MA #* ( ⸮ → ‎؟‎ ) REVERSED QUESTION MARK → ARABIC QUESTION MARK # + +1D16D ; 002E ; MA # ( 𝅭 → . ) MUSICAL SYMBOL COMBINING AUGMENTATION DOT → FULL STOP # +2024 ; 002E ; MA #* ( ․ → . ) ONE DOT LEADER → FULL STOP # +0701 ; 002E ; MA #* ( ‎܁‎ → . ) SYRIAC SUPRALINEAR FULL STOP → FULL STOP # +0702 ; 002E ; MA #* ( ‎܂‎ → . ) SYRIAC SUBLINEAR FULL STOP → FULL STOP # +A60E ; 002E ; MA #* ( ꘎ → . ) VAI FULL STOP → FULL STOP # +10A50 ; 002E ; MA #* ( ‎𐩐‎ → . ) KHAROSHTHI PUNCTUATION DOT → FULL STOP # +0660 ; 002E ; MA # ( ‎٠‎ → . ) ARABIC-INDIC DIGIT ZERO → FULL STOP # +06F0 ; 002E ; MA # ( ۰ → . ) EXTENDED ARABIC-INDIC DIGIT ZERO → FULL STOP # →‎٠‎→ +A4F8 ; 002E ; MA # ( ꓸ → . ) LISU LETTER TONE MYA TI → FULL STOP # + +A4FB ; 002E 002C ; MA # ( ꓻ → ., ) LISU LETTER TONE MYA BO → FULL STOP, COMMA # + +2025 ; 002E 002E ; MA #* ( ‥ → .. 
) TWO DOT LEADER → FULL STOP, FULL STOP # +A4FA ; 002E 002E ; MA # ( ꓺ → .. ) LISU LETTER TONE MYA CYA → FULL STOP, FULL STOP # + +2026 ; 002E 002E 002E ; MA #* ( … → ... ) HORIZONTAL ELLIPSIS → FULL STOP, FULL STOP, FULL STOP # + +A6F4 ; A6F3 A6F3 ; MA #* ( ꛴ → ꛳꛳ ) BAMUM COLON → BAMUM FULL STOP, BAMUM FULL STOP # + +30FB ; 00B7 ; MA # ( ・ → · ) KATAKANA MIDDLE DOT → MIDDLE DOT # →•→ +FF65 ; 00B7 ; MA # ( ・ → · ) HALFWIDTH KATAKANA MIDDLE DOT → MIDDLE DOT # →•→ +16EB ; 00B7 ; MA #* ( ᛫ → · ) RUNIC SINGLE PUNCTUATION → MIDDLE DOT # +0387 ; 00B7 ; MA # ( · → · ) GREEK ANO TELEIA → MIDDLE DOT # +2E31 ; 00B7 ; MA #* ( ⸱ → · ) WORD SEPARATOR MIDDLE DOT → MIDDLE DOT # +10101 ; 00B7 ; MA #* ( 𐄁 → · ) AEGEAN WORD SEPARATOR DOT → MIDDLE DOT # +2022 ; 00B7 ; MA #* ( • → · ) BULLET → MIDDLE DOT # +2027 ; 00B7 ; MA #* ( ‧ → · ) HYPHENATION POINT → MIDDLE DOT # +2219 ; 00B7 ; MA #* ( ∙ → · ) BULLET OPERATOR → MIDDLE DOT # +22C5 ; 00B7 ; MA #* ( ⋅ → · ) DOT OPERATOR → MIDDLE DOT # +A78F ; 00B7 ; MA # ( ꞏ → · ) LATIN LETTER SINOLOGICAL DOT → MIDDLE DOT # +1427 ; 00B7 ; MA # ( ᐧ → · ) CANADIAN SYLLABICS FINAL MIDDLE DOT → MIDDLE DOT # + +22EF ; 00B7 00B7 00B7 ; MA #* ( ⋯ → ··· ) MIDLINE HORIZONTAL ELLIPSIS → MIDDLE DOT, MIDDLE DOT, MIDDLE DOT # +2D48 ; 00B7 00B7 00B7 ; MA # ( ⵈ → ··· ) TIFINAGH LETTER TUAREG YAQ → MIDDLE DOT, MIDDLE DOT, MIDDLE DOT # →⋯→ + +1444 ; 00B7 003C ; MA # ( ᑄ → ·< ) CANADIAN SYLLABICS PWA → MIDDLE DOT, LESS-THAN SIGN # →ᐧᐸ→ + +22D7 ; 00B7 003E ; MA #* ( ⋗ → ·> ) GREATER-THAN WITH DOT → MIDDLE DOT, GREATER-THAN SIGN # →ᑀ→→ᐧᐳ→ +1437 ; 00B7 003E ; MA # ( ᐷ → ·> ) CANADIAN SYLLABICS CARRIER HI → MIDDLE DOT, GREATER-THAN SIGN # →ᑀ→→ᐧᐳ→ +1440 ; 00B7 003E ; MA # ( ᑀ → ·> ) CANADIAN SYLLABICS PWO → MIDDLE DOT, GREATER-THAN SIGN # →ᐧᐳ→ + +152F ; 00B7 0034 ; MA # ( ᔯ → ·4 ) CANADIAN SYLLABICS YWE → MIDDLE DOT, DIGIT FOUR # →ᐧ4→ + +147E ; 00B7 0062 ; MA # ( ᑾ → ·b ) CANADIAN SYLLABICS KWA → MIDDLE DOT, LATIN SMALL LETTER B # →ᐧᑲ→ + +1480 ; 00B7 0062 0307 ; MA # ( ᒀ 
→ ·ḃ ) CANADIAN SYLLABICS KWAA → MIDDLE DOT, LATIN SMALL LETTER B, COMBINING DOT ABOVE # →ᐧᑳ→ + +147A ; 00B7 0064 ; MA # ( ᑺ → ·d ) CANADIAN SYLLABICS KWO → MIDDLE DOT, LATIN SMALL LETTER D # →ᐧᑯ→ + +1498 ; 00B7 004A ; MA # ( ᒘ → ·J ) CANADIAN SYLLABICS CWO → MIDDLE DOT, LATIN CAPITAL LETTER J # →ᐧᒍ→ + +14B6 ; 00B7 004C ; MA # ( ᒶ → ·L ) CANADIAN SYLLABICS MWA → MIDDLE DOT, LATIN CAPITAL LETTER L # →ᐧL→ + +1476 ; 00B7 0050 ; MA # ( ᑶ → ·P ) CANADIAN SYLLABICS KWI → MIDDLE DOT, LATIN CAPITAL LETTER P # →ᐧᑭ→ + +1457 ; 00B7 0055 ; MA # ( ᑗ → ·U ) CANADIAN SYLLABICS TWE → MIDDLE DOT, LATIN CAPITAL LETTER U # →ᐧᑌ→→·ᑌ→ + +143A ; 00B7 0056 ; MA # ( ᐺ → ·V ) CANADIAN SYLLABICS PWE → MIDDLE DOT, LATIN CAPITAL LETTER V # →ᐧᐯ→ + +143C ; 00B7 0245 ; MA # ( ᐼ → ·Ʌ ) CANADIAN SYLLABICS PWI → MIDDLE DOT, LATIN CAPITAL LETTER TURNED V # →ᐧᐱ→→·ᐱ→ + +14AE ; 00B7 0393 ; MA # ( ᒮ → ·Γ ) CANADIAN SYLLABICS MWI → MIDDLE DOT, GREEK CAPITAL LETTER GAMMA # →ᐧᒥ→→·ᒥ→ + +140E ; 00B7 0394 ; MA # ( ᐎ → ·Δ ) CANADIAN SYLLABICS WI → MIDDLE DOT, GREEK CAPITAL LETTER DELTA # →ᐧᐃ→ + +1459 ; 00B7 0548 ; MA # ( ᑙ → ·Ո ) CANADIAN SYLLABICS TWI → MIDDLE DOT, ARMENIAN CAPITAL LETTER VO # →ᐧᑎ→→·ᑎ→ + +140C ; 00B7 1401 ; MA # ( ᐌ → ·ᐁ ) CANADIAN SYLLABICS WE → MIDDLE DOT, CANADIAN SYLLABICS E # →ᐧᐁ→ + +1410 ; 00B7 1404 ; MA # ( ᐐ → ·ᐄ ) CANADIAN SYLLABICS WII → MIDDLE DOT, CANADIAN SYLLABICS II # →ᐧᐄ→ + +1412 ; 00B7 1405 ; MA # ( ᐒ → ·ᐅ ) CANADIAN SYLLABICS WO → MIDDLE DOT, CANADIAN SYLLABICS O # →ᐧᐅ→ + +1414 ; 00B7 1406 ; MA # ( ᐔ → ·ᐆ ) CANADIAN SYLLABICS WOO → MIDDLE DOT, CANADIAN SYLLABICS OO # →ᐧᐆ→ + +1417 ; 00B7 140A ; MA # ( ᐗ → ·ᐊ ) CANADIAN SYLLABICS WA → MIDDLE DOT, CANADIAN SYLLABICS A # →ᐧᐊ→ + +1419 ; 00B7 140B ; MA # ( ᐙ → ·ᐋ ) CANADIAN SYLLABICS WAA → MIDDLE DOT, CANADIAN SYLLABICS AA # →ᐧᐋ→ + +143E ; 00B7 1432 ; MA # ( ᐾ → ·ᐲ ) CANADIAN SYLLABICS PWII → MIDDLE DOT, CANADIAN SYLLABICS PII # →ᐧᐲ→ + +1442 ; 00B7 1434 ; MA # ( ᑂ → ·ᐴ ) CANADIAN SYLLABICS PWOO → MIDDLE DOT, CANADIAN SYLLABICS POO 
# →ᐧᐴ→ + +1446 ; 00B7 1439 ; MA # ( ᑆ → ·ᐹ ) CANADIAN SYLLABICS PWAA → MIDDLE DOT, CANADIAN SYLLABICS PAA # →ᐧᐹ→ + +145B ; 00B7 144F ; MA # ( ᑛ → ·ᑏ ) CANADIAN SYLLABICS TWII → MIDDLE DOT, CANADIAN SYLLABICS TII # →ᐧᑏ→ + +1454 ; 00B7 1450 ; MA # ( ᑔ → ·ᑐ ) CANADIAN SYLLABICS CARRIER DI → MIDDLE DOT, CANADIAN SYLLABICS TO # →ᑝ→→ᐧᑐ→ +145D ; 00B7 1450 ; MA # ( ᑝ → ·ᑐ ) CANADIAN SYLLABICS TWO → MIDDLE DOT, CANADIAN SYLLABICS TO # →ᐧᑐ→ + +145F ; 00B7 1451 ; MA # ( ᑟ → ·ᑑ ) CANADIAN SYLLABICS TWOO → MIDDLE DOT, CANADIAN SYLLABICS TOO # →ᐧᑑ→ + +1461 ; 00B7 1455 ; MA # ( ᑡ → ·ᑕ ) CANADIAN SYLLABICS TWA → MIDDLE DOT, CANADIAN SYLLABICS TA # →ᐧᑕ→ + +1463 ; 00B7 1456 ; MA # ( ᑣ → ·ᑖ ) CANADIAN SYLLABICS TWAA → MIDDLE DOT, CANADIAN SYLLABICS TAA # →ᐧᑖ→ + +1474 ; 00B7 146B ; MA # ( ᑴ → ·ᑫ ) CANADIAN SYLLABICS KWE → MIDDLE DOT, CANADIAN SYLLABICS KE # →ᐧᑫ→ + +1478 ; 00B7 146E ; MA # ( ᑸ → ·ᑮ ) CANADIAN SYLLABICS KWII → MIDDLE DOT, CANADIAN SYLLABICS KII # →ᐧᑮ→ + +147C ; 00B7 1470 ; MA # ( ᑼ → ·ᑰ ) CANADIAN SYLLABICS KWOO → MIDDLE DOT, CANADIAN SYLLABICS KOO # →ᐧᑰ→ + +1492 ; 00B7 1489 ; MA # ( ᒒ → ·ᒉ ) CANADIAN SYLLABICS CWE → MIDDLE DOT, CANADIAN SYLLABICS CE # →ᐧᒉ→ + +1494 ; 00B7 148B ; MA # ( ᒔ → ·ᒋ ) CANADIAN SYLLABICS CWI → MIDDLE DOT, CANADIAN SYLLABICS CI # →ᐧᒋ→ + +1496 ; 00B7 148C ; MA # ( ᒖ → ·ᒌ ) CANADIAN SYLLABICS CWII → MIDDLE DOT, CANADIAN SYLLABICS CII # →ᐧᒌ→ + +149A ; 00B7 148E ; MA # ( ᒚ → ·ᒎ ) CANADIAN SYLLABICS CWOO → MIDDLE DOT, CANADIAN SYLLABICS COO # →ᐧᒎ→ + +149C ; 00B7 1490 ; MA # ( ᒜ → ·ᒐ ) CANADIAN SYLLABICS CWA → MIDDLE DOT, CANADIAN SYLLABICS CA # →ᐧᒐ→ + +149E ; 00B7 1491 ; MA # ( ᒞ → ·ᒑ ) CANADIAN SYLLABICS CWAA → MIDDLE DOT, CANADIAN SYLLABICS CAA # →ᐧᒑ→ + +14AC ; 00B7 14A3 ; MA # ( ᒬ → ·ᒣ ) CANADIAN SYLLABICS MWE → MIDDLE DOT, CANADIAN SYLLABICS ME # →ᐧᒣ→ + +14B0 ; 00B7 14A6 ; MA # ( ᒰ → ·ᒦ ) CANADIAN SYLLABICS MWII → MIDDLE DOT, CANADIAN SYLLABICS MII # →ᐧᒦ→ + +14B2 ; 00B7 14A7 ; MA # ( ᒲ → ·ᒧ ) CANADIAN SYLLABICS MWO → MIDDLE DOT, CANADIAN 
SYLLABICS MO # →ᐧᒧ→ + +14B4 ; 00B7 14A8 ; MA # ( ᒴ → ·ᒨ ) CANADIAN SYLLABICS MWOO → MIDDLE DOT, CANADIAN SYLLABICS MOO # →ᐧᒨ→ + +14B8 ; 00B7 14AB ; MA # ( ᒸ → ·ᒫ ) CANADIAN SYLLABICS MWAA → MIDDLE DOT, CANADIAN SYLLABICS MAA # →ᐧᒫ→ + +14C9 ; 00B7 14C0 ; MA # ( ᓉ → ·ᓀ ) CANADIAN SYLLABICS NWE → MIDDLE DOT, CANADIAN SYLLABICS NE # →ᐧᓀ→ + +18C6 ; 00B7 14C2 ; MA # ( ᣆ → ·ᓂ ) CANADIAN SYLLABICS NWI → MIDDLE DOT, CANADIAN SYLLABICS NI # →ᐧᓂ→ + +18C8 ; 00B7 14C3 ; MA # ( ᣈ → ·ᓃ ) CANADIAN SYLLABICS NWII → MIDDLE DOT, CANADIAN SYLLABICS NII # →ᐧᓃ→ + +18CA ; 00B7 14C4 ; MA # ( ᣊ → ·ᓄ ) CANADIAN SYLLABICS NWO → MIDDLE DOT, CANADIAN SYLLABICS NO # →ᐧᓄ→ + +18CC ; 00B7 14C5 ; MA # ( ᣌ → ·ᓅ ) CANADIAN SYLLABICS NWOO → MIDDLE DOT, CANADIAN SYLLABICS NOO # →ᐧᓅ→ + +14CB ; 00B7 14C7 ; MA # ( ᓋ → ·ᓇ ) CANADIAN SYLLABICS NWA → MIDDLE DOT, CANADIAN SYLLABICS NA # →ᐧᓇ→ + +14CD ; 00B7 14C8 ; MA # ( ᓍ → ·ᓈ ) CANADIAN SYLLABICS NWAA → MIDDLE DOT, CANADIAN SYLLABICS NAA # →ᐧᓈ→ + +14DC ; 00B7 14D3 ; MA # ( ᓜ → ·ᓓ ) CANADIAN SYLLABICS LWE → MIDDLE DOT, CANADIAN SYLLABICS LE # →ᐧᓓ→ + +14DE ; 00B7 14D5 ; MA # ( ᓞ → ·ᓕ ) CANADIAN SYLLABICS LWI → MIDDLE DOT, CANADIAN SYLLABICS LI # →ᐧᓕ→ + +14E0 ; 00B7 14D6 ; MA # ( ᓠ → ·ᓖ ) CANADIAN SYLLABICS LWII → MIDDLE DOT, CANADIAN SYLLABICS LII # →ᐧᓖ→ + +14E2 ; 00B7 14D7 ; MA # ( ᓢ → ·ᓗ ) CANADIAN SYLLABICS LWO → MIDDLE DOT, CANADIAN SYLLABICS LO # →ᐧᓗ→ + +14E4 ; 00B7 14D8 ; MA # ( ᓤ → ·ᓘ ) CANADIAN SYLLABICS LWOO → MIDDLE DOT, CANADIAN SYLLABICS LOO # →ᐧᓘ→ + +14E6 ; 00B7 14DA ; MA # ( ᓦ → ·ᓚ ) CANADIAN SYLLABICS LWA → MIDDLE DOT, CANADIAN SYLLABICS LA # →ᐧᓚ→ + +14E8 ; 00B7 14DB ; MA # ( ᓨ → ·ᓛ ) CANADIAN SYLLABICS LWAA → MIDDLE DOT, CANADIAN SYLLABICS LAA # →ᐧᓛ→ + +14F6 ; 00B7 14ED ; MA # ( ᓶ → ·ᓭ ) CANADIAN SYLLABICS SWE → MIDDLE DOT, CANADIAN SYLLABICS SE # →ᐧᓭ→ + +14F8 ; 00B7 14EF ; MA # ( ᓸ → ·ᓯ ) CANADIAN SYLLABICS SWI → MIDDLE DOT, CANADIAN SYLLABICS SI # →ᐧᓯ→ + +14FA ; 00B7 14F0 ; MA # ( ᓺ → ·ᓰ ) CANADIAN SYLLABICS SWII → MIDDLE DOT, CANADIAN 
SYLLABICS SII # →ᐧᓰ→ + +14FC ; 00B7 14F1 ; MA # ( ᓼ → ·ᓱ ) CANADIAN SYLLABICS SWO → MIDDLE DOT, CANADIAN SYLLABICS SO # →ᐧᓱ→ + +14FE ; 00B7 14F2 ; MA # ( ᓾ → ·ᓲ ) CANADIAN SYLLABICS SWOO → MIDDLE DOT, CANADIAN SYLLABICS SOO # →ᐧᓲ→ + +1500 ; 00B7 14F4 ; MA # ( ᔀ → ·ᓴ ) CANADIAN SYLLABICS SWA → MIDDLE DOT, CANADIAN SYLLABICS SA # →ᐧᓴ→ + +1502 ; 00B7 14F5 ; MA # ( ᔂ → ·ᓵ ) CANADIAN SYLLABICS SWAA → MIDDLE DOT, CANADIAN SYLLABICS SAA # →ᐧᓵ→ + +1517 ; 00B7 1510 ; MA # ( ᔗ → ·ᔐ ) CANADIAN SYLLABICS SHWE → MIDDLE DOT, CANADIAN SYLLABICS SHE # →ᐧᔐ→ + +1519 ; 00B7 1511 ; MA # ( ᔙ → ·ᔑ ) CANADIAN SYLLABICS SHWI → MIDDLE DOT, CANADIAN SYLLABICS SHI # →ᐧᔑ→ + +151B ; 00B7 1512 ; MA # ( ᔛ → ·ᔒ ) CANADIAN SYLLABICS SHWII → MIDDLE DOT, CANADIAN SYLLABICS SHII # →ᐧᔒ→ + +151D ; 00B7 1513 ; MA # ( ᔝ → ·ᔓ ) CANADIAN SYLLABICS SHWO → MIDDLE DOT, CANADIAN SYLLABICS SHO # →ᐧᔓ→ + +151F ; 00B7 1514 ; MA # ( ᔟ → ·ᔔ ) CANADIAN SYLLABICS SHWOO → MIDDLE DOT, CANADIAN SYLLABICS SHOO # →ᐧᔔ→ + +1521 ; 00B7 1515 ; MA # ( ᔡ → ·ᔕ ) CANADIAN SYLLABICS SHWA → MIDDLE DOT, CANADIAN SYLLABICS SHA # →ᐧᔕ→ + +1523 ; 00B7 1516 ; MA # ( ᔣ → ·ᔖ ) CANADIAN SYLLABICS SHWAA → MIDDLE DOT, CANADIAN SYLLABICS SHAA # →ᐧᔖ→ + +1531 ; 00B7 1528 ; MA # ( ᔱ → ·ᔨ ) CANADIAN SYLLABICS YWI → MIDDLE DOT, CANADIAN SYLLABICS YI # →ᐧᔨ→ + +1533 ; 00B7 1529 ; MA # ( ᔳ → ·ᔩ ) CANADIAN SYLLABICS YWII → MIDDLE DOT, CANADIAN SYLLABICS YII # →ᐧᔩ→ + +1535 ; 00B7 152A ; MA # ( ᔵ → ·ᔪ ) CANADIAN SYLLABICS YWO → MIDDLE DOT, CANADIAN SYLLABICS YO # →ᐧᔪ→ + +1537 ; 00B7 152B ; MA # ( ᔷ → ·ᔫ ) CANADIAN SYLLABICS YWOO → MIDDLE DOT, CANADIAN SYLLABICS YOO # →ᐧᔫ→ + +1539 ; 00B7 152D ; MA # ( ᔹ → ·ᔭ ) CANADIAN SYLLABICS YWA → MIDDLE DOT, CANADIAN SYLLABICS YA # →ᐧᔭ→ + +153B ; 00B7 152E ; MA # ( ᔻ → ·ᔮ ) CANADIAN SYLLABICS YWAA → MIDDLE DOT, CANADIAN SYLLABICS YAA # →ᐧᔮ→ + +18CE ; 00B7 1543 ; MA # ( ᣎ → ·ᕃ ) CANADIAN SYLLABICS RWEE → MIDDLE DOT, CANADIAN SYLLABICS R-CREE RE # →ᐧᕃ→ + +18CF ; 00B7 1546 ; MA # ( ᣏ → ·ᕆ ) CANADIAN SYLLABICS RWI → 
MIDDLE DOT, CANADIAN SYLLABICS RI # →ᐧᕆ→ + +18D0 ; 00B7 1547 ; MA # ( ᣐ → ·ᕇ ) CANADIAN SYLLABICS RWII → MIDDLE DOT, CANADIAN SYLLABICS RII # →ᐧᕇ→ + +18D1 ; 00B7 1548 ; MA # ( ᣑ → ·ᕈ ) CANADIAN SYLLABICS RWO → MIDDLE DOT, CANADIAN SYLLABICS RO # →ᐧᕈ→ + +18D2 ; 00B7 1549 ; MA # ( ᣒ → ·ᕉ ) CANADIAN SYLLABICS RWOO → MIDDLE DOT, CANADIAN SYLLABICS ROO # →ᐧᕉ→ + +18D3 ; 00B7 154B ; MA # ( ᣓ → ·ᕋ ) CANADIAN SYLLABICS RWA → MIDDLE DOT, CANADIAN SYLLABICS RA # →ᐧᕋ→ + +154E ; 00B7 154C ; MA # ( ᕎ → ·ᕌ ) CANADIAN SYLLABICS RWAA → MIDDLE DOT, CANADIAN SYLLABICS RAA # →ᐧᕌ→ + +155B ; 00B7 155A ; MA # ( ᕛ → ·ᕚ ) CANADIAN SYLLABICS FWAA → MIDDLE DOT, CANADIAN SYLLABICS FAA # →ᐧᕚ→ + +1568 ; 00B7 1567 ; MA # ( ᕨ → ·ᕧ ) CANADIAN SYLLABICS THWAA → MIDDLE DOT, CANADIAN SYLLABICS THAA # →ᐧᕧ→ + +18B3 ; 00B7 18B1 ; MA # ( ᢳ → ·ᢱ ) CANADIAN SYLLABICS WAY → MIDDLE DOT, CANADIAN SYLLABICS AY # →ᐧᢱ→ + +18B6 ; 00B7 18B4 ; MA # ( ᢶ → ·ᢴ ) CANADIAN SYLLABICS PWOY → MIDDLE DOT, CANADIAN SYLLABICS POY # →ᐧᢴ→ + +18B9 ; 00B7 18B8 ; MA # ( ᢹ → ·ᢸ ) CANADIAN SYLLABICS KWAY → MIDDLE DOT, CANADIAN SYLLABICS KAY # →ᐧᢸ→ + +18C2 ; 00B7 18C0 ; MA # ( ᣂ → ·ᣀ ) CANADIAN SYLLABICS SHWOY → MIDDLE DOT, CANADIAN SYLLABICS SHOY # →ᐧᣀ→ + +A830 ; 0964 ; MA #* ( ꠰ → । ) NORTH INDIC FRACTION ONE QUARTER → DEVANAGARI DANDA # + +0965 ; 0964 0964 ; MA #* ( ॥ → ।। ) DEVANAGARI DOUBLE DANDA → DEVANAGARI DANDA, DEVANAGARI DANDA # + +1C3C ; 1C3B 1C3B ; MA #* ( ᰼ → ᰻᰻ ) LEPCHA PUNCTUATION NYET THYOOM TA-ROL → LEPCHA PUNCTUATION TA-ROL, LEPCHA PUNCTUATION TA-ROL # + +104B ; 104A 104A ; MA #* ( ။ → ၊၊ ) MYANMAR SIGN SECTION → MYANMAR SIGN LITTLE SECTION, MYANMAR SIGN LITTLE SECTION # + +1AA9 ; 1AA8 1AA8 ; MA #* ( ᪩ → ᪨᪨ ) TAI THAM SIGN KAANKUU → TAI THAM SIGN KAAN, TAI THAM SIGN KAAN # + +1AAB ; 1AAA 1AA8 ; MA #* ( ᪫ → ᪪᪨ ) TAI THAM SIGN SATKAANKUU → TAI THAM SIGN SATKAAN, TAI THAM SIGN KAAN # + +1B5F ; 1B5E 1B5E ; MA #* ( ᭟ → ᭞᭞ ) BALINESE CARIK PAREREN → BALINESE CARIK SIKI, BALINESE CARIK SIKI # + +10A57 ; 10A56 10A56 ; MA 
#* ( ‎𐩗‎ → ‎𐩖𐩖‎ ) KHAROSHTHI PUNCTUATION DOUBLE DANDA → KHAROSHTHI PUNCTUATION DANDA, KHAROSHTHI PUNCTUATION DANDA # + +1144C ; 1144B 1144B ; MA #* ( 𑑌 → 𑑋𑑋 ) NEWA DOUBLE DANDA → NEWA DANDA, NEWA DANDA # + +11642 ; 11641 11641 ; MA #* ( 𑙂 → 𑙁𑙁 ) MODI DOUBLE DANDA → MODI DANDA, MODI DANDA # + +11C42 ; 11C41 11C41 ; MA #* ( 𑱂 → 𑱁𑱁 ) BHAIKSUKI DOUBLE DANDA → BHAIKSUKI DANDA, BHAIKSUKI DANDA # + +1C7F ; 1C7E 1C7E ; MA #* ( ᱿ → ᱾᱾ ) OL CHIKI PUNCTUATION DOUBLE MUCAAD → OL CHIKI PUNCTUATION MUCAAD, OL CHIKI PUNCTUATION MUCAAD # + +055D ; 0027 ; MA #* ( ՝ → ' ) ARMENIAN COMMA → APOSTROPHE # →ˋ→→`→→‘→ +FF07 ; 0027 ; MA #* ( ' → ' ) FULLWIDTH APOSTROPHE → APOSTROPHE # →’→ +2018 ; 0027 ; MA #* ( ‘ → ' ) LEFT SINGLE QUOTATION MARK → APOSTROPHE # +2019 ; 0027 ; MA #* ( ’ → ' ) RIGHT SINGLE QUOTATION MARK → APOSTROPHE # +201B ; 0027 ; MA #* ( ‛ → ' ) SINGLE HIGH-REVERSED-9 QUOTATION MARK → APOSTROPHE # →′→ +05F3 ; 0027 ; MA #* ( ‎׳‎ → ' ) HEBREW PUNCTUATION GERESH → APOSTROPHE # +2032 ; 0027 ; MA #* ( ′ → ' ) PRIME → APOSTROPHE # +2035 ; 0027 ; MA #* ( ‵ → ' ) REVERSED PRIME → APOSTROPHE # →ʽ→→‘→ +055A ; 0027 ; MA #* ( ՚ → ' ) ARMENIAN APOSTROPHE → APOSTROPHE # →’→ +0060 ; 0027 ; MA #* ( ` → ' ) GRAVE ACCENT → APOSTROPHE # →ˋ→→`→→‘→ +1FEF ; 0027 ; MA #* ( ` → ' ) GREEK VARIA → APOSTROPHE # →ˋ→→`→→‘→ +FF40 ; 0027 ; MA #* ( ` → ' ) FULLWIDTH GRAVE ACCENT → APOSTROPHE # →‘→ +00B4 ; 0027 ; MA #* ( ´ → ' ) ACUTE ACCENT → APOSTROPHE # →΄→→ʹ→ +0384 ; 0027 ; MA #* ( ΄ → ' ) GREEK TONOS → APOSTROPHE # →ʹ→ +1FFD ; 0027 ; MA #* ( ´ → ' ) GREEK OXIA → APOSTROPHE # →´→→΄→→ʹ→ +1FBD ; 0027 ; MA #* ( ᾽ → ' ) GREEK KORONIS → APOSTROPHE # →’→ +1FBF ; 0027 ; MA #* ( ᾿ → ' ) GREEK PSILI → APOSTROPHE # →’→ +1FFE ; 0027 ; MA #* ( ῾ → ' ) GREEK DASIA → APOSTROPHE # →‛→→′→ +02B9 ; 0027 ; MA # ( ʹ → ' ) MODIFIER LETTER PRIME → APOSTROPHE # +0374 ; 0027 ; MA # ( ʹ → ' ) GREEK NUMERAL SIGN → APOSTROPHE # →′→ +02C8 ; 0027 ; MA # ( ˈ → ' ) MODIFIER LETTER VERTICAL LINE → APOSTROPHE # +02CA ; 0027 ; MA # ( 
ˊ → ' ) MODIFIER LETTER ACUTE ACCENT → APOSTROPHE # →΄→→ʹ→ +02CB ; 0027 ; MA # ( ˋ → ' ) MODIFIER LETTER GRAVE ACCENT → APOSTROPHE # →`→→‘→ +02F4 ; 0027 ; MA #* ( ˴ → ' ) MODIFIER LETTER MIDDLE GRAVE ACCENT → APOSTROPHE # →ˋ→→`→→‘→ +02BB ; 0027 ; MA # ( ʻ → ' ) MODIFIER LETTER TURNED COMMA → APOSTROPHE # →‘→ +02BD ; 0027 ; MA # ( ʽ → ' ) MODIFIER LETTER REVERSED COMMA → APOSTROPHE # →‘→ +02BC ; 0027 ; MA # ( ʼ → ' ) MODIFIER LETTER APOSTROPHE → APOSTROPHE # →′→ +02BE ; 0027 ; MA # ( ʾ → ' ) MODIFIER LETTER RIGHT HALF RING → APOSTROPHE # →ʼ→→′→ +A78C ; 0027 ; MA # ( ꞌ → ' ) LATIN SMALL LETTER SALTILLO → APOSTROPHE # +05D9 ; 0027 ; MA # ( ‎י‎ → ' ) HEBREW LETTER YOD → APOSTROPHE # +07F4 ; 0027 ; MA # ( ‎ߴ‎ → ' ) NKO HIGH TONE APOSTROPHE → APOSTROPHE # →’→ +07F5 ; 0027 ; MA # ( ‎ߵ‎ → ' ) NKO LOW TONE APOSTROPHE → APOSTROPHE # →‘→ +144A ; 0027 ; MA # ( ᑊ → ' ) CANADIAN SYLLABICS WEST-CREE P → APOSTROPHE # →ˈ→ +16CC ; 0027 ; MA # ( ᛌ → ' ) RUNIC LETTER SHORT-TWIG-SOL S → APOSTROPHE # +16F51 ; 0027 ; MA # ( 𖽑 → ' ) MIAO SIGN ASPIRATION → APOSTROPHE # →ʼ→→′→ +16F52 ; 0027 ; MA # ( 𖽒 → ' ) MIAO SIGN REFORMED VOICING → APOSTROPHE # →ʻ→→‘→ + +1CD3 ; 0027 0027 ; MA #* ( ᳓ → '' ) VEDIC SIGN NIHSHVASA → APOSTROPHE, APOSTROPHE # →″→→"→ +0022 ; 0027 0027 ; MA #* ( " → '' ) QUOTATION MARK → APOSTROPHE, APOSTROPHE # +FF02 ; 0027 0027 ; MA #* ( " → '' ) FULLWIDTH QUOTATION MARK → APOSTROPHE, APOSTROPHE # →”→→"→ +201C ; 0027 0027 ; MA #* ( “ → '' ) LEFT DOUBLE QUOTATION MARK → APOSTROPHE, APOSTROPHE # →"→ +201D ; 0027 0027 ; MA #* ( ” → '' ) RIGHT DOUBLE QUOTATION MARK → APOSTROPHE, APOSTROPHE # →"→ +201F ; 0027 0027 ; MA #* ( ‟ → '' ) DOUBLE HIGH-REVERSED-9 QUOTATION MARK → APOSTROPHE, APOSTROPHE # →“→→"→ +05F4 ; 0027 0027 ; MA #* ( ‎״‎ → '' ) HEBREW PUNCTUATION GERSHAYIM → APOSTROPHE, APOSTROPHE # →"→ +2033 ; 0027 0027 ; MA #* ( ″ → '' ) DOUBLE PRIME → APOSTROPHE, APOSTROPHE # →"→ +2036 ; 0027 0027 ; MA #* ( ‶ → '' ) REVERSED DOUBLE PRIME → APOSTROPHE, APOSTROPHE # →‵‵→ +3003 ; 
0027 0027 ; MA #* ( 〃 → '' ) DITTO MARK → APOSTROPHE, APOSTROPHE # →″→→"→ +02DD ; 0027 0027 ; MA #* ( ˝ → '' ) DOUBLE ACUTE ACCENT → APOSTROPHE, APOSTROPHE # →"→ +02BA ; 0027 0027 ; MA # ( ʺ → '' ) MODIFIER LETTER DOUBLE PRIME → APOSTROPHE, APOSTROPHE # →"→ +02F6 ; 0027 0027 ; MA #* ( ˶ → '' ) MODIFIER LETTER MIDDLE DOUBLE ACUTE ACCENT → APOSTROPHE, APOSTROPHE # →˝→→"→ +02EE ; 0027 0027 ; MA # ( ˮ → '' ) MODIFIER LETTER DOUBLE APOSTROPHE → APOSTROPHE, APOSTROPHE # →″→→"→ +05F2 ; 0027 0027 ; MA # ( ‎ײ‎ → '' ) HEBREW LIGATURE YIDDISH DOUBLE YOD → APOSTROPHE, APOSTROPHE # →‎יי‎→ + +2034 ; 0027 0027 0027 ; MA #* ( ‴ → ''' ) TRIPLE PRIME → APOSTROPHE, APOSTROPHE, APOSTROPHE # →′′′→ +2037 ; 0027 0027 0027 ; MA #* ( ‷ → ''' ) REVERSED TRIPLE PRIME → APOSTROPHE, APOSTROPHE, APOSTROPHE # →‵‵‵→ + +2057 ; 0027 0027 0027 0027 ; MA #* ( ⁗ → '''' ) QUADRUPLE PRIME → APOSTROPHE, APOSTROPHE, APOSTROPHE, APOSTROPHE # →′′′′→ + +0181 ; 0027 0042 ; MA # ( Ɓ → 'B ) LATIN CAPITAL LETTER B WITH HOOK → APOSTROPHE, LATIN CAPITAL LETTER B # →ʽB→ + +018A ; 0027 0044 ; MA # ( Ɗ → 'D ) LATIN CAPITAL LETTER D WITH HOOK → APOSTROPHE, LATIN CAPITAL LETTER D # →ʽD→ + +0149 ; 0027 006E ; MA # ( ʼn → 'n ) LATIN SMALL LETTER N PRECEDED BY APOSTROPHE → APOSTROPHE, LATIN SMALL LETTER N # →ʼn→ + +01A4 ; 0027 0050 ; MA # ( Ƥ → 'P ) LATIN CAPITAL LETTER P WITH HOOK → APOSTROPHE, LATIN CAPITAL LETTER P # →ʽP→ + +01AC ; 0027 0054 ; MA # ( Ƭ → 'T ) LATIN CAPITAL LETTER T WITH HOOK → APOSTROPHE, LATIN CAPITAL LETTER T # →ʽT→ + +01B3 ; 0027 0059 ; MA # ( Ƴ → 'Y ) LATIN CAPITAL LETTER Y WITH HOOK → APOSTROPHE, LATIN CAPITAL LETTER Y # →ʽY→ + +FF3B ; 0028 ; MA #* ( [ → ( ) FULLWIDTH LEFT SQUARE BRACKET → LEFT PARENTHESIS # →〔→ +2768 ; 0028 ; MA #* ( ❨ → ( ) MEDIUM LEFT PARENTHESIS ORNAMENT → LEFT PARENTHESIS # +2772 ; 0028 ; MA #* ( ❲ → ( ) LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT → LEFT PARENTHESIS # →〔→ +3014 ; 0028 ; MA #* ( 〔 → ( ) LEFT TORTOISE SHELL BRACKET → LEFT PARENTHESIS # +FD3E ; 0028 ; MA #* ( ﴾ → 
( ) ORNATE LEFT PARENTHESIS → LEFT PARENTHESIS # + +2E28 ; 0028 0028 ; MA #* ( ⸨ → (( ) LEFT DOUBLE PARENTHESIS → LEFT PARENTHESIS, LEFT PARENTHESIS # + +3220 ; 0028 30FC 0029 ; MA #* ( ㈠ → (ー) ) PARENTHESIZED IDEOGRAPH ONE → LEFT PARENTHESIS, KATAKANA-HIRAGANA PROLONGED SOUND MARK, RIGHT PARENTHESIS # →(一)→ + +2475 ; 0028 0032 0029 ; MA #* ( ⑵ → (2) ) PARENTHESIZED DIGIT TWO → LEFT PARENTHESIS, DIGIT TWO, RIGHT PARENTHESIS # + +2487 ; 0028 0032 004F 0029 ; MA #* ( ⒇ → (2O) ) PARENTHESIZED NUMBER TWENTY → LEFT PARENTHESIS, DIGIT TWO, LATIN CAPITAL LETTER O, RIGHT PARENTHESIS # →(20)→ + +2476 ; 0028 0033 0029 ; MA #* ( ⑶ → (3) ) PARENTHESIZED DIGIT THREE → LEFT PARENTHESIS, DIGIT THREE, RIGHT PARENTHESIS # + +2477 ; 0028 0034 0029 ; MA #* ( ⑷ → (4) ) PARENTHESIZED DIGIT FOUR → LEFT PARENTHESIS, DIGIT FOUR, RIGHT PARENTHESIS # + +2478 ; 0028 0035 0029 ; MA #* ( ⑸ → (5) ) PARENTHESIZED DIGIT FIVE → LEFT PARENTHESIS, DIGIT FIVE, RIGHT PARENTHESIS # + +2479 ; 0028 0036 0029 ; MA #* ( ⑹ → (6) ) PARENTHESIZED DIGIT SIX → LEFT PARENTHESIS, DIGIT SIX, RIGHT PARENTHESIS # + +247A ; 0028 0037 0029 ; MA #* ( ⑺ → (7) ) PARENTHESIZED DIGIT SEVEN → LEFT PARENTHESIS, DIGIT SEVEN, RIGHT PARENTHESIS # + +247B ; 0028 0038 0029 ; MA #* ( ⑻ → (8) ) PARENTHESIZED DIGIT EIGHT → LEFT PARENTHESIS, DIGIT EIGHT, RIGHT PARENTHESIS # + +247C ; 0028 0039 0029 ; MA #* ( ⑼ → (9) ) PARENTHESIZED DIGIT NINE → LEFT PARENTHESIS, DIGIT NINE, RIGHT PARENTHESIS # + +249C ; 0028 0061 0029 ; MA #* ( ⒜ → (a) ) PARENTHESIZED LATIN SMALL LETTER A → LEFT PARENTHESIS, LATIN SMALL LETTER A, RIGHT PARENTHESIS # + +1F110 ; 0028 0041 0029 ; MA #* ( 🄐 → (A) ) PARENTHESIZED LATIN CAPITAL LETTER A → LEFT PARENTHESIS, LATIN CAPITAL LETTER A, RIGHT PARENTHESIS # + +249D ; 0028 0062 0029 ; MA #* ( ⒝ → (b) ) PARENTHESIZED LATIN SMALL LETTER B → LEFT PARENTHESIS, LATIN SMALL LETTER B, RIGHT PARENTHESIS # + +1F111 ; 0028 0042 0029 ; MA #* ( 🄑 → (B) ) PARENTHESIZED LATIN CAPITAL LETTER B → LEFT PARENTHESIS, LATIN CAPITAL 
LETTER B, RIGHT PARENTHESIS # + +249E ; 0028 0063 0029 ; MA #* ( ⒞ → (c) ) PARENTHESIZED LATIN SMALL LETTER C → LEFT PARENTHESIS, LATIN SMALL LETTER C, RIGHT PARENTHESIS # + +1F112 ; 0028 0043 0029 ; MA #* ( 🄒 → (C) ) PARENTHESIZED LATIN CAPITAL LETTER C → LEFT PARENTHESIS, LATIN CAPITAL LETTER C, RIGHT PARENTHESIS # + +249F ; 0028 0064 0029 ; MA #* ( ⒟ → (d) ) PARENTHESIZED LATIN SMALL LETTER D → LEFT PARENTHESIS, LATIN SMALL LETTER D, RIGHT PARENTHESIS # + +1F113 ; 0028 0044 0029 ; MA #* ( 🄓 → (D) ) PARENTHESIZED LATIN CAPITAL LETTER D → LEFT PARENTHESIS, LATIN CAPITAL LETTER D, RIGHT PARENTHESIS # + +24A0 ; 0028 0065 0029 ; MA #* ( ⒠ → (e) ) PARENTHESIZED LATIN SMALL LETTER E → LEFT PARENTHESIS, LATIN SMALL LETTER E, RIGHT PARENTHESIS # + +1F114 ; 0028 0045 0029 ; MA #* ( 🄔 → (E) ) PARENTHESIZED LATIN CAPITAL LETTER E → LEFT PARENTHESIS, LATIN CAPITAL LETTER E, RIGHT PARENTHESIS # + +24A1 ; 0028 0066 0029 ; MA #* ( ⒡ → (f) ) PARENTHESIZED LATIN SMALL LETTER F → LEFT PARENTHESIS, LATIN SMALL LETTER F, RIGHT PARENTHESIS # + +1F115 ; 0028 0046 0029 ; MA #* ( 🄕 → (F) ) PARENTHESIZED LATIN CAPITAL LETTER F → LEFT PARENTHESIS, LATIN CAPITAL LETTER F, RIGHT PARENTHESIS # + +24A2 ; 0028 0067 0029 ; MA #* ( ⒢ → (g) ) PARENTHESIZED LATIN SMALL LETTER G → LEFT PARENTHESIS, LATIN SMALL LETTER G, RIGHT PARENTHESIS # + +1F116 ; 0028 0047 0029 ; MA #* ( 🄖 → (G) ) PARENTHESIZED LATIN CAPITAL LETTER G → LEFT PARENTHESIS, LATIN CAPITAL LETTER G, RIGHT PARENTHESIS # + +24A3 ; 0028 0068 0029 ; MA #* ( ⒣ → (h) ) PARENTHESIZED LATIN SMALL LETTER H → LEFT PARENTHESIS, LATIN SMALL LETTER H, RIGHT PARENTHESIS # + +1F117 ; 0028 0048 0029 ; MA #* ( 🄗 → (H) ) PARENTHESIZED LATIN CAPITAL LETTER H → LEFT PARENTHESIS, LATIN CAPITAL LETTER H, RIGHT PARENTHESIS # + +24A4 ; 0028 0069 0029 ; MA #* ( ⒤ → (i) ) PARENTHESIZED LATIN SMALL LETTER I → LEFT PARENTHESIS, LATIN SMALL LETTER I, RIGHT PARENTHESIS # + +24A5 ; 0028 006A 0029 ; MA #* ( ⒥ → (j) ) PARENTHESIZED LATIN SMALL LETTER J → LEFT 
PARENTHESIS, LATIN SMALL LETTER J, RIGHT PARENTHESIS # + +1F119 ; 0028 004A 0029 ; MA #* ( 🄙 → (J) ) PARENTHESIZED LATIN CAPITAL LETTER J → LEFT PARENTHESIS, LATIN CAPITAL LETTER J, RIGHT PARENTHESIS # + +24A6 ; 0028 006B 0029 ; MA #* ( ⒦ → (k) ) PARENTHESIZED LATIN SMALL LETTER K → LEFT PARENTHESIS, LATIN SMALL LETTER K, RIGHT PARENTHESIS # + +1F11A ; 0028 004B 0029 ; MA #* ( 🄚 → (K) ) PARENTHESIZED LATIN CAPITAL LETTER K → LEFT PARENTHESIS, LATIN CAPITAL LETTER K, RIGHT PARENTHESIS # + +2474 ; 0028 006C 0029 ; MA #* ( ⑴ → (l) ) PARENTHESIZED DIGIT ONE → LEFT PARENTHESIS, LATIN SMALL LETTER L, RIGHT PARENTHESIS # →(1)→ +1F118 ; 0028 006C 0029 ; MA #* ( 🄘 → (l) ) PARENTHESIZED LATIN CAPITAL LETTER I → LEFT PARENTHESIS, LATIN SMALL LETTER L, RIGHT PARENTHESIS # →(I)→ +24A7 ; 0028 006C 0029 ; MA #* ( ⒧ → (l) ) PARENTHESIZED LATIN SMALL LETTER L → LEFT PARENTHESIS, LATIN SMALL LETTER L, RIGHT PARENTHESIS # + +1F11B ; 0028 004C 0029 ; MA #* ( 🄛 → (L) ) PARENTHESIZED LATIN CAPITAL LETTER L → LEFT PARENTHESIS, LATIN CAPITAL LETTER L, RIGHT PARENTHESIS # + +247F ; 0028 006C 0032 0029 ; MA #* ( ⑿ → (l2) ) PARENTHESIZED NUMBER TWELVE → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT TWO, RIGHT PARENTHESIS # →(12)→ + +2480 ; 0028 006C 0033 0029 ; MA #* ( ⒀ → (l3) ) PARENTHESIZED NUMBER THIRTEEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT THREE, RIGHT PARENTHESIS # →(13)→ + +2481 ; 0028 006C 0034 0029 ; MA #* ( ⒁ → (l4) ) PARENTHESIZED NUMBER FOURTEEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT FOUR, RIGHT PARENTHESIS # →(14)→ + +2482 ; 0028 006C 0035 0029 ; MA #* ( ⒂ → (l5) ) PARENTHESIZED NUMBER FIFTEEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT FIVE, RIGHT PARENTHESIS # →(15)→ + +2483 ; 0028 006C 0036 0029 ; MA #* ( ⒃ → (l6) ) PARENTHESIZED NUMBER SIXTEEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT SIX, RIGHT PARENTHESIS # →(16)→ + +2484 ; 0028 006C 0037 0029 ; MA #* ( ⒄ → (l7) ) PARENTHESIZED NUMBER SEVENTEEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT 
SEVEN, RIGHT PARENTHESIS # →(17)→ + +2485 ; 0028 006C 0038 0029 ; MA #* ( ⒅ → (l8) ) PARENTHESIZED NUMBER EIGHTEEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT EIGHT, RIGHT PARENTHESIS # →(18)→ + +2486 ; 0028 006C 0039 0029 ; MA #* ( ⒆ → (l9) ) PARENTHESIZED NUMBER NINETEEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, DIGIT NINE, RIGHT PARENTHESIS # →(19)→ + +247E ; 0028 006C 006C 0029 ; MA #* ( ⑾ → (ll) ) PARENTHESIZED NUMBER ELEVEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, LATIN SMALL LETTER L, RIGHT PARENTHESIS # →(11)→ + +247D ; 0028 006C 004F 0029 ; MA #* ( ⑽ → (lO) ) PARENTHESIZED NUMBER TEN → LEFT PARENTHESIS, LATIN SMALL LETTER L, LATIN CAPITAL LETTER O, RIGHT PARENTHESIS # →(10)→ + +1F11C ; 0028 004D 0029 ; MA #* ( 🄜 → (M) ) PARENTHESIZED LATIN CAPITAL LETTER M → LEFT PARENTHESIS, LATIN CAPITAL LETTER M, RIGHT PARENTHESIS # + +24A9 ; 0028 006E 0029 ; MA #* ( ⒩ → (n) ) PARENTHESIZED LATIN SMALL LETTER N → LEFT PARENTHESIS, LATIN SMALL LETTER N, RIGHT PARENTHESIS # + +1F11D ; 0028 004E 0029 ; MA #* ( 🄝 → (N) ) PARENTHESIZED LATIN CAPITAL LETTER N → LEFT PARENTHESIS, LATIN CAPITAL LETTER N, RIGHT PARENTHESIS # + +24AA ; 0028 006F 0029 ; MA #* ( ⒪ → (o) ) PARENTHESIZED LATIN SMALL LETTER O → LEFT PARENTHESIS, LATIN SMALL LETTER O, RIGHT PARENTHESIS # + +1F11E ; 0028 004F 0029 ; MA #* ( 🄞 → (O) ) PARENTHESIZED LATIN CAPITAL LETTER O → LEFT PARENTHESIS, LATIN CAPITAL LETTER O, RIGHT PARENTHESIS # + +24AB ; 0028 0070 0029 ; MA #* ( ⒫ → (p) ) PARENTHESIZED LATIN SMALL LETTER P → LEFT PARENTHESIS, LATIN SMALL LETTER P, RIGHT PARENTHESIS # + +1F11F ; 0028 0050 0029 ; MA #* ( 🄟 → (P) ) PARENTHESIZED LATIN CAPITAL LETTER P → LEFT PARENTHESIS, LATIN CAPITAL LETTER P, RIGHT PARENTHESIS # + +24AC ; 0028 0071 0029 ; MA #* ( ⒬ → (q) ) PARENTHESIZED LATIN SMALL LETTER Q → LEFT PARENTHESIS, LATIN SMALL LETTER Q, RIGHT PARENTHESIS # + +1F120 ; 0028 0051 0029 ; MA #* ( 🄠 → (Q) ) PARENTHESIZED LATIN CAPITAL LETTER Q → LEFT PARENTHESIS, LATIN CAPITAL LETTER Q, RIGHT PARENTHESIS 
# + +24AD ; 0028 0072 0029 ; MA #* ( ⒭ → (r) ) PARENTHESIZED LATIN SMALL LETTER R → LEFT PARENTHESIS, LATIN SMALL LETTER R, RIGHT PARENTHESIS # + +1F121 ; 0028 0052 0029 ; MA #* ( 🄡 → (R) ) PARENTHESIZED LATIN CAPITAL LETTER R → LEFT PARENTHESIS, LATIN CAPITAL LETTER R, RIGHT PARENTHESIS # + +24A8 ; 0028 0072 006E 0029 ; MA #* ( ⒨ → (rn) ) PARENTHESIZED LATIN SMALL LETTER M → LEFT PARENTHESIS, LATIN SMALL LETTER R, LATIN SMALL LETTER N, RIGHT PARENTHESIS # →(m)→ + +24AE ; 0028 0073 0029 ; MA #* ( ⒮ → (s) ) PARENTHESIZED LATIN SMALL LETTER S → LEFT PARENTHESIS, LATIN SMALL LETTER S, RIGHT PARENTHESIS # + +1F122 ; 0028 0053 0029 ; MA #* ( 🄢 → (S) ) PARENTHESIZED LATIN CAPITAL LETTER S → LEFT PARENTHESIS, LATIN CAPITAL LETTER S, RIGHT PARENTHESIS # +1F12A ; 0028 0053 0029 ; MA #* ( 🄪 → (S) ) TORTOISE SHELL BRACKETED LATIN CAPITAL LETTER S → LEFT PARENTHESIS, LATIN CAPITAL LETTER S, RIGHT PARENTHESIS # →〔S〕→ + +24AF ; 0028 0074 0029 ; MA #* ( ⒯ → (t) ) PARENTHESIZED LATIN SMALL LETTER T → LEFT PARENTHESIS, LATIN SMALL LETTER T, RIGHT PARENTHESIS # + +1F123 ; 0028 0054 0029 ; MA #* ( 🄣 → (T) ) PARENTHESIZED LATIN CAPITAL LETTER T → LEFT PARENTHESIS, LATIN CAPITAL LETTER T, RIGHT PARENTHESIS # + +24B0 ; 0028 0075 0029 ; MA #* ( ⒰ → (u) ) PARENTHESIZED LATIN SMALL LETTER U → LEFT PARENTHESIS, LATIN SMALL LETTER U, RIGHT PARENTHESIS # + +1F124 ; 0028 0055 0029 ; MA #* ( 🄤 → (U) ) PARENTHESIZED LATIN CAPITAL LETTER U → LEFT PARENTHESIS, LATIN CAPITAL LETTER U, RIGHT PARENTHESIS # + +24B1 ; 0028 0076 0029 ; MA #* ( ⒱ → (v) ) PARENTHESIZED LATIN SMALL LETTER V → LEFT PARENTHESIS, LATIN SMALL LETTER V, RIGHT PARENTHESIS # + +1F125 ; 0028 0056 0029 ; MA #* ( 🄥 → (V) ) PARENTHESIZED LATIN CAPITAL LETTER V → LEFT PARENTHESIS, LATIN CAPITAL LETTER V, RIGHT PARENTHESIS # + +24B2 ; 0028 0077 0029 ; MA #* ( ⒲ → (w) ) PARENTHESIZED LATIN SMALL LETTER W → LEFT PARENTHESIS, LATIN SMALL LETTER W, RIGHT PARENTHESIS # + +1F126 ; 0028 0057 0029 ; MA #* ( 🄦 → (W) ) PARENTHESIZED LATIN 
CAPITAL LETTER W → LEFT PARENTHESIS, LATIN CAPITAL LETTER W, RIGHT PARENTHESIS # + +24B3 ; 0028 0078 0029 ; MA #* ( ⒳ → (x) ) PARENTHESIZED LATIN SMALL LETTER X → LEFT PARENTHESIS, LATIN SMALL LETTER X, RIGHT PARENTHESIS # + +1F127 ; 0028 0058 0029 ; MA #* ( 🄧 → (X) ) PARENTHESIZED LATIN CAPITAL LETTER X → LEFT PARENTHESIS, LATIN CAPITAL LETTER X, RIGHT PARENTHESIS # + +24B4 ; 0028 0079 0029 ; MA #* ( ⒴ → (y) ) PARENTHESIZED LATIN SMALL LETTER Y → LEFT PARENTHESIS, LATIN SMALL LETTER Y, RIGHT PARENTHESIS # + +1F128 ; 0028 0059 0029 ; MA #* ( 🄨 → (Y) ) PARENTHESIZED LATIN CAPITAL LETTER Y → LEFT PARENTHESIS, LATIN CAPITAL LETTER Y, RIGHT PARENTHESIS # + +24B5 ; 0028 007A 0029 ; MA #* ( ⒵ → (z) ) PARENTHESIZED LATIN SMALL LETTER Z → LEFT PARENTHESIS, LATIN SMALL LETTER Z, RIGHT PARENTHESIS # + +1F129 ; 0028 005A 0029 ; MA #* ( 🄩 → (Z) ) PARENTHESIZED LATIN CAPITAL LETTER Z → LEFT PARENTHESIS, LATIN CAPITAL LETTER Z, RIGHT PARENTHESIS # + +3200 ; 0028 1100 0029 ; MA #* ( ㈀ → (ᄀ) ) PARENTHESIZED HANGUL KIYEOK → LEFT PARENTHESIS, HANGUL CHOSEONG KIYEOK, RIGHT PARENTHESIS # + +320E ; 0028 AC00 0029 ; MA #* ( ㈎ → (가) ) PARENTHESIZED HANGUL KIYEOK A → LEFT PARENTHESIS, HANGUL SYLLABLE GA, RIGHT PARENTHESIS # + +3201 ; 0028 1102 0029 ; MA #* ( ㈁ → (ᄂ) ) PARENTHESIZED HANGUL NIEUN → LEFT PARENTHESIS, HANGUL CHOSEONG NIEUN, RIGHT PARENTHESIS # + +320F ; 0028 B098 0029 ; MA #* ( ㈏ → (나) ) PARENTHESIZED HANGUL NIEUN A → LEFT PARENTHESIS, HANGUL SYLLABLE NA, RIGHT PARENTHESIS # + +3202 ; 0028 1103 0029 ; MA #* ( ㈂ → (ᄃ) ) PARENTHESIZED HANGUL TIKEUT → LEFT PARENTHESIS, HANGUL CHOSEONG TIKEUT, RIGHT PARENTHESIS # + +3210 ; 0028 B2E4 0029 ; MA #* ( ㈐ → (다) ) PARENTHESIZED HANGUL TIKEUT A → LEFT PARENTHESIS, HANGUL SYLLABLE DA, RIGHT PARENTHESIS # + +3203 ; 0028 1105 0029 ; MA #* ( ㈃ → (ᄅ) ) PARENTHESIZED HANGUL RIEUL → LEFT PARENTHESIS, HANGUL CHOSEONG RIEUL, RIGHT PARENTHESIS # + +3211 ; 0028 B77C 0029 ; MA #* ( ㈑ → (라) ) PARENTHESIZED HANGUL RIEUL A → LEFT PARENTHESIS, HANGUL 
SYLLABLE RA, RIGHT PARENTHESIS # + +3204 ; 0028 1106 0029 ; MA #* ( ㈄ → (ᄆ) ) PARENTHESIZED HANGUL MIEUM → LEFT PARENTHESIS, HANGUL CHOSEONG MIEUM, RIGHT PARENTHESIS # + +3212 ; 0028 B9C8 0029 ; MA #* ( ㈒ → (마) ) PARENTHESIZED HANGUL MIEUM A → LEFT PARENTHESIS, HANGUL SYLLABLE MA, RIGHT PARENTHESIS # + +3205 ; 0028 1107 0029 ; MA #* ( ㈅ → (ᄇ) ) PARENTHESIZED HANGUL PIEUP → LEFT PARENTHESIS, HANGUL CHOSEONG PIEUP, RIGHT PARENTHESIS # + +3213 ; 0028 BC14 0029 ; MA #* ( ㈓ → (바) ) PARENTHESIZED HANGUL PIEUP A → LEFT PARENTHESIS, HANGUL SYLLABLE BA, RIGHT PARENTHESIS # + +3206 ; 0028 1109 0029 ; MA #* ( ㈆ → (ᄉ) ) PARENTHESIZED HANGUL SIOS → LEFT PARENTHESIS, HANGUL CHOSEONG SIOS, RIGHT PARENTHESIS # + +3214 ; 0028 C0AC 0029 ; MA #* ( ㈔ → (사) ) PARENTHESIZED HANGUL SIOS A → LEFT PARENTHESIS, HANGUL SYLLABLE SA, RIGHT PARENTHESIS # + +3207 ; 0028 110B 0029 ; MA #* ( ㈇ → (ᄋ) ) PARENTHESIZED HANGUL IEUNG → LEFT PARENTHESIS, HANGUL CHOSEONG IEUNG, RIGHT PARENTHESIS # + +3215 ; 0028 C544 0029 ; MA #* ( ㈕ → (아) ) PARENTHESIZED HANGUL IEUNG A → LEFT PARENTHESIS, HANGUL SYLLABLE A, RIGHT PARENTHESIS # + +321D ; 0028 C624 C804 0029 ; MA #* ( ㈝ → (오전) ) PARENTHESIZED KOREAN CHARACTER OJEON → LEFT PARENTHESIS, HANGUL SYLLABLE O, HANGUL SYLLABLE JEON, RIGHT PARENTHESIS # + +321E ; 0028 C624 D6C4 0029 ; MA #* ( ㈞ → (오후) ) PARENTHESIZED KOREAN CHARACTER O HU → LEFT PARENTHESIS, HANGUL SYLLABLE O, HANGUL SYLLABLE HU, RIGHT PARENTHESIS # + +3208 ; 0028 110C 0029 ; MA #* ( ㈈ → (ᄌ) ) PARENTHESIZED HANGUL CIEUC → LEFT PARENTHESIS, HANGUL CHOSEONG CIEUC, RIGHT PARENTHESIS # + +3216 ; 0028 C790 0029 ; MA #* ( ㈖ → (자) ) PARENTHESIZED HANGUL CIEUC A → LEFT PARENTHESIS, HANGUL SYLLABLE JA, RIGHT PARENTHESIS # + +321C ; 0028 C8FC 0029 ; MA #* ( ㈜ → (주) ) PARENTHESIZED HANGUL CIEUC U → LEFT PARENTHESIS, HANGUL SYLLABLE JU, RIGHT PARENTHESIS # + +3209 ; 0028 110E 0029 ; MA #* ( ㈉ → (ᄎ) ) PARENTHESIZED HANGUL CHIEUCH → LEFT PARENTHESIS, HANGUL CHOSEONG CHIEUCH, RIGHT PARENTHESIS # + +3217 ; 0028 
CC28 0029 ; MA #* ( ㈗ → (차) ) PARENTHESIZED HANGUL CHIEUCH A → LEFT PARENTHESIS, HANGUL SYLLABLE CA, RIGHT PARENTHESIS # + +320A ; 0028 110F 0029 ; MA #* ( ㈊ → (ᄏ) ) PARENTHESIZED HANGUL KHIEUKH → LEFT PARENTHESIS, HANGUL CHOSEONG KHIEUKH, RIGHT PARENTHESIS # + +3218 ; 0028 CE74 0029 ; MA #* ( ㈘ → (카) ) PARENTHESIZED HANGUL KHIEUKH A → LEFT PARENTHESIS, HANGUL SYLLABLE KA, RIGHT PARENTHESIS # + +320B ; 0028 1110 0029 ; MA #* ( ㈋ → (ᄐ) ) PARENTHESIZED HANGUL THIEUTH → LEFT PARENTHESIS, HANGUL CHOSEONG THIEUTH, RIGHT PARENTHESIS # + +3219 ; 0028 D0C0 0029 ; MA #* ( ㈙ → (타) ) PARENTHESIZED HANGUL THIEUTH A → LEFT PARENTHESIS, HANGUL SYLLABLE TA, RIGHT PARENTHESIS # + +320C ; 0028 1111 0029 ; MA #* ( ㈌ → (ᄑ) ) PARENTHESIZED HANGUL PHIEUPH → LEFT PARENTHESIS, HANGUL CHOSEONG PHIEUPH, RIGHT PARENTHESIS # + +321A ; 0028 D30C 0029 ; MA #* ( ㈚ → (파) ) PARENTHESIZED HANGUL PHIEUPH A → LEFT PARENTHESIS, HANGUL SYLLABLE PA, RIGHT PARENTHESIS # + +320D ; 0028 1112 0029 ; MA #* ( ㈍ → (ᄒ) ) PARENTHESIZED HANGUL HIEUH → LEFT PARENTHESIS, HANGUL CHOSEONG HIEUH, RIGHT PARENTHESIS # + +321B ; 0028 D558 0029 ; MA #* ( ㈛ → (하) ) PARENTHESIZED HANGUL HIEUH A → LEFT PARENTHESIS, HANGUL SYLLABLE HA, RIGHT PARENTHESIS # + +3226 ; 0028 4E03 0029 ; MA #* ( ㈦ → (七) ) PARENTHESIZED IDEOGRAPH SEVEN → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4E03, RIGHT PARENTHESIS # + +3222 ; 0028 4E09 0029 ; MA #* ( ㈢ → (三) ) PARENTHESIZED IDEOGRAPH THREE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4E09, RIGHT PARENTHESIS # +1F241 ; 0028 4E09 0029 ; MA #* ( 🉁 → (三) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-4E09 → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4E09, RIGHT PARENTHESIS # →〔三〕→ + +3228 ; 0028 4E5D 0029 ; MA #* ( ㈨ → (九) ) PARENTHESIZED IDEOGRAPH NINE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4E5D, RIGHT PARENTHESIS # + +3221 ; 0028 4E8C 0029 ; MA #* ( ㈡ → (二) ) PARENTHESIZED IDEOGRAPH TWO → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4E8C, RIGHT PARENTHESIS # +1F242 ; 0028 4E8C 0029 ; MA #* ( 🉂 → (二) ) 
TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-4E8C → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4E8C, RIGHT PARENTHESIS # →〔二〕→ + +3224 ; 0028 4E94 0029 ; MA #* ( ㈤ → (五) ) PARENTHESIZED IDEOGRAPH FIVE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4E94, RIGHT PARENTHESIS # + +3239 ; 0028 4EE3 0029 ; MA #* ( ㈹ → (代) ) PARENTHESIZED IDEOGRAPH REPRESENT → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4EE3, RIGHT PARENTHESIS # + +323D ; 0028 4F01 0029 ; MA #* ( ㈽ → (企) ) PARENTHESIZED IDEOGRAPH ENTERPRISE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4F01, RIGHT PARENTHESIS # + +3241 ; 0028 4F11 0029 ; MA #* ( ㉁ → (休) ) PARENTHESIZED IDEOGRAPH REST → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-4F11, RIGHT PARENTHESIS # + +3227 ; 0028 516B 0029 ; MA #* ( ㈧ → (八) ) PARENTHESIZED IDEOGRAPH EIGHT → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-516B, RIGHT PARENTHESIS # + +3225 ; 0028 516D 0029 ; MA #* ( ㈥ → (六) ) PARENTHESIZED IDEOGRAPH SIX → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-516D, RIGHT PARENTHESIS # + +3238 ; 0028 52B4 0029 ; MA #* ( ㈸ → (労) ) PARENTHESIZED IDEOGRAPH LABOR → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-52B4, RIGHT PARENTHESIS # + +1F247 ; 0028 52DD 0029 ; MA #* ( 🉇 → (勝) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-52DD → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-52DD, RIGHT PARENTHESIS # →〔勝〕→ + +3229 ; 0028 5341 0029 ; MA #* ( ㈩ → (十) ) PARENTHESIZED IDEOGRAPH TEN → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-5341, RIGHT PARENTHESIS # + +323F ; 0028 5354 0029 ; MA #* ( ㈿ → (協) ) PARENTHESIZED IDEOGRAPH ALLIANCE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-5354, RIGHT PARENTHESIS # + +3234 ; 0028 540D 0029 ; MA #* ( ㈴ → (名) ) PARENTHESIZED IDEOGRAPH NAME → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-540D, RIGHT PARENTHESIS # + +323A ; 0028 547C 0029 ; MA #* ( ㈺ → (呼) ) PARENTHESIZED IDEOGRAPH CALL → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-547C, RIGHT PARENTHESIS # + +3223 ; 0028 56DB 0029 ; MA #* ( ㈣ → (四) ) PARENTHESIZED IDEOGRAPH FOUR → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-56DB, 
RIGHT PARENTHESIS # + +322F ; 0028 571F 0029 ; MA #* ( ㈯ → (土) ) PARENTHESIZED IDEOGRAPH EARTH → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-571F, RIGHT PARENTHESIS # + +323B ; 0028 5B66 0029 ; MA #* ( ㈻ → (学) ) PARENTHESIZED IDEOGRAPH STUDY → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-5B66, RIGHT PARENTHESIS # + +1F243 ; 0028 5B89 0029 ; MA #* ( 🉃 → (安) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-5B89 → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-5B89, RIGHT PARENTHESIS # →〔安〕→ + +1F245 ; 0028 6253 0029 ; MA #* ( 🉅 → (打) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6253 → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-6253, RIGHT PARENTHESIS # →〔打〕→ + +1F248 ; 0028 6557 0029 ; MA #* ( 🉈 → (敗) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557 → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-6557, RIGHT PARENTHESIS # →〔敗〕→ + +3230 ; 0028 65E5 0029 ; MA #* ( ㈰ → (日) ) PARENTHESIZED IDEOGRAPH SUN → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-65E5, RIGHT PARENTHESIS # + +322A ; 0028 6708 0029 ; MA #* ( ㈪ → (月) ) PARENTHESIZED IDEOGRAPH MOON → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-6708, RIGHT PARENTHESIS # + +3232 ; 0028 6709 0029 ; MA #* ( ㈲ → (有) ) PARENTHESIZED IDEOGRAPH HAVE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-6709, RIGHT PARENTHESIS # + +322D ; 0028 6728 0029 ; MA #* ( ㈭ → (木) ) PARENTHESIZED IDEOGRAPH WOOD → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-6728, RIGHT PARENTHESIS # + +1F240 ; 0028 672C 0029 ; MA #* ( 🉀 → (本) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-672C, RIGHT PARENTHESIS # →〔本〕→ + +3231 ; 0028 682A 0029 ; MA #* ( ㈱ → (株) ) PARENTHESIZED IDEOGRAPH STOCK → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-682A, RIGHT PARENTHESIS # + +322C ; 0028 6C34 0029 ; MA #* ( ㈬ → (水) ) PARENTHESIZED IDEOGRAPH WATER → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-6C34, RIGHT PARENTHESIS # + +322B ; 0028 706B 0029 ; MA #* ( ㈫ → (火) ) PARENTHESIZED IDEOGRAPH FIRE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-706B, RIGHT PARENTHESIS # + +1F244 ; 
0028 70B9 0029 ; MA #* ( 🉄 → (点) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-70B9 → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-70B9, RIGHT PARENTHESIS # →〔点〕→ + +3235 ; 0028 7279 0029 ; MA #* ( ㈵ → (特) ) PARENTHESIZED IDEOGRAPH SPECIAL → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-7279, RIGHT PARENTHESIS # + +1F246 ; 0028 76D7 0029 ; MA #* ( 🉆 → (盗) ) TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-76D7 → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-76D7, RIGHT PARENTHESIS # →〔盗〕→ + +323C ; 0028 76E3 0029 ; MA #* ( ㈼ → (監) ) PARENTHESIZED IDEOGRAPH SUPERVISE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-76E3, RIGHT PARENTHESIS # + +3233 ; 0028 793E 0029 ; MA #* ( ㈳ → (社) ) PARENTHESIZED IDEOGRAPH SOCIETY → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-793E, RIGHT PARENTHESIS # + +3237 ; 0028 795D 0029 ; MA #* ( ㈷ → (祝) ) PARENTHESIZED IDEOGRAPH CONGRATULATION → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-795D, RIGHT PARENTHESIS # + +3240 ; 0028 796D 0029 ; MA #* ( ㉀ → (祭) ) PARENTHESIZED IDEOGRAPH FESTIVAL → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-796D, RIGHT PARENTHESIS # + +3242 ; 0028 81EA 0029 ; MA #* ( ㉂ → (自) ) PARENTHESIZED IDEOGRAPH SELF → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-81EA, RIGHT PARENTHESIS # + +3243 ; 0028 81F3 0029 ; MA #* ( ㉃ → (至) ) PARENTHESIZED IDEOGRAPH REACH → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-81F3, RIGHT PARENTHESIS # + +3236 ; 0028 8CA1 0029 ; MA #* ( ㈶ → (財) ) PARENTHESIZED IDEOGRAPH FINANCIAL → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-8CA1, RIGHT PARENTHESIS # + +323E ; 0028 8CC7 0029 ; MA #* ( ㈾ → (資) ) PARENTHESIZED IDEOGRAPH RESOURCE → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-8CC7, RIGHT PARENTHESIS # + +322E ; 0028 91D1 0029 ; MA #* ( ㈮ → (金) ) PARENTHESIZED IDEOGRAPH METAL → LEFT PARENTHESIS, CJK UNIFIED IDEOGRAPH-91D1, RIGHT PARENTHESIS # + +FF3D ; 0029 ; MA #* ( ] → ) ) FULLWIDTH RIGHT SQUARE BRACKET → RIGHT PARENTHESIS # →〕→ +2769 ; 0029 ; MA #* ( ❩ → ) ) MEDIUM RIGHT PARENTHESIS ORNAMENT → RIGHT PARENTHESIS # +2773 ; 0029 ; MA #* ( ❳ → ) ) 
LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT → RIGHT PARENTHESIS # →〕→ +3015 ; 0029 ; MA #* ( 〕 → ) ) RIGHT TORTOISE SHELL BRACKET → RIGHT PARENTHESIS # +FD3F ; 0029 ; MA #* ( ﴿ → ) ) ORNATE RIGHT PARENTHESIS → RIGHT PARENTHESIS # + +2E29 ; 0029 0029 ; MA #* ( ⸩ → )) ) RIGHT DOUBLE PARENTHESIS → RIGHT PARENTHESIS, RIGHT PARENTHESIS # + +2774 ; 007B ; MA #* ( ❴ → { ) MEDIUM LEFT CURLY BRACKET ORNAMENT → LEFT CURLY BRACKET # +1D114 ; 007B ; MA #* ( 𝄔 → { ) MUSICAL SYMBOL BRACE → LEFT CURLY BRACKET # + +2775 ; 007D ; MA #* ( ❵ → } ) MEDIUM RIGHT CURLY BRACKET ORNAMENT → RIGHT CURLY BRACKET # + +301A ; 27E6 ; MA #* ( 〚 → ⟦ ) LEFT WHITE SQUARE BRACKET → MATHEMATICAL LEFT WHITE SQUARE BRACKET # + +301B ; 27E7 ; MA #* ( 〛 → ⟧ ) RIGHT WHITE SQUARE BRACKET → MATHEMATICAL RIGHT WHITE SQUARE BRACKET # + +27E8 ; 276C ; MA #* ( ⟨ → ❬ ) MATHEMATICAL LEFT ANGLE BRACKET → MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT # →〈→ +2329 ; 276C ; MA #* ( 〈 → ❬ ) LEFT-POINTING ANGLE BRACKET → MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT # →〈→ +3008 ; 276C ; MA #* ( 〈 → ❬ ) LEFT ANGLE BRACKET → MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT # +31DB ; 276C ; MA #* ( ㇛ → ❬ ) CJK STROKE PD → MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT # →⟨→→〈→ +304F ; 276C ; MA # ( く → ❬ ) HIRAGANA LETTER KU → MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT # →㇛→→⟨→→〈→ +21FE8 ; 276C ; MA # ( 𡿨 → ❬ ) CJK UNIFIED IDEOGRAPH-21FE8 → MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT # →㇛→→⟨→→〈→ + +27E9 ; 276D ; MA #* ( ⟩ → ❭ ) MATHEMATICAL RIGHT ANGLE BRACKET → MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT # →〉→ +232A ; 276D ; MA #* ( 〉 → ❭ ) RIGHT-POINTING ANGLE BRACKET → MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT # →〉→ +3009 ; 276D ; MA #* ( 〉 → ❭ ) RIGHT ANGLE BRACKET → MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT # + +FF3E ; FE3F ; MA #* ( ^ → ︿ ) FULLWIDTH CIRCUMFLEX ACCENT → PRESENTATION FORM FOR VERTICAL LEFT ANGLE BRACKET # + +2E3F ; 00B6 ; MA #* ( ⸿ → ¶ ) CAPITULUM → PILCROW SIGN # + +204E ; 002A ; MA #* ( ⁎ → * ) LOW 
ASTERISK → ASTERISK # +066D ; 002A ; MA #* ( ‎٭‎ → * ) ARABIC FIVE POINTED STAR → ASTERISK # +2217 ; 002A ; MA #* ( ∗ → * ) ASTERISK OPERATOR → ASTERISK # +1031F ; 002A ; MA # ( 𐌟 → * ) OLD ITALIC LETTER ESS → ASTERISK # + +1735 ; 002F ; MA #* ( ᜵ → / ) PHILIPPINE SINGLE PUNCTUATION → SOLIDUS # +2041 ; 002F ; MA #* ( ⁁ → / ) CARET INSERTION POINT → SOLIDUS # +2215 ; 002F ; MA #* ( ∕ → / ) DIVISION SLASH → SOLIDUS # +2044 ; 002F ; MA #* ( ⁄ → / ) FRACTION SLASH → SOLIDUS # +2571 ; 002F ; MA #* ( ╱ → / ) BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT → SOLIDUS # +27CB ; 002F ; MA #* ( ⟋ → / ) MATHEMATICAL RISING DIAGONAL → SOLIDUS # +29F8 ; 002F ; MA #* ( ⧸ → / ) BIG SOLIDUS → SOLIDUS # +1D23A ; 002F ; MA #* ( 𝈺 → / ) GREEK INSTRUMENTAL NOTATION SYMBOL-47 → SOLIDUS # +31D3 ; 002F ; MA #* ( ㇓ → / ) CJK STROKE SP → SOLIDUS # →⼃→ +3033 ; 002F ; MA # ( 〳 → / ) VERTICAL KANA REPEAT MARK UPPER HALF → SOLIDUS # +2CC7 ; 002F ; MA # ( ⳇ → / ) COPTIC SMALL LETTER OLD COPTIC ESH → SOLIDUS # +2CC6 ; 002F ; MA # ( Ⳇ → / ) COPTIC CAPITAL LETTER OLD COPTIC ESH → SOLIDUS # +30CE ; 002F ; MA # ( ノ → / ) KATAKANA LETTER NO → SOLIDUS # →⼃→ +4E3F ; 002F ; MA # ( 丿 → / ) CJK UNIFIED IDEOGRAPH-4E3F → SOLIDUS # →⼃→ +2F03 ; 002F ; MA #* ( ⼃ → / ) KANGXI RADICAL SLASH → SOLIDUS # + +29F6 ; 002F 0304 ; MA #* ( ⧶ → /̄ ) SOLIDUS WITH OVERBAR → SOLIDUS, COMBINING MACRON # + +2AFD ; 002F 002F ; MA #* ( ⫽ → // ) DOUBLE SOLIDUS OPERATOR → SOLIDUS, SOLIDUS # + +2AFB ; 002F 002F 002F ; MA #* ( ⫻ → /// ) TRIPLE SOLIDUS BINARY RELATION → SOLIDUS, SOLIDUS, SOLIDUS # + +FF3C ; 005C ; MA #* ( \ → \ ) FULLWIDTH REVERSE SOLIDUS → REVERSE SOLIDUS # →∖→ +FE68 ; 005C ; MA #* ( ﹨ → \ ) SMALL REVERSE SOLIDUS → REVERSE SOLIDUS # →∖→ +2216 ; 005C ; MA #* ( ∖ → \ ) SET MINUS → REVERSE SOLIDUS # +27CD ; 005C ; MA #* ( ⟍ → \ ) MATHEMATICAL FALLING DIAGONAL → REVERSE SOLIDUS # +29F5 ; 005C ; MA #* ( ⧵ → \ ) REVERSE SOLIDUS OPERATOR → REVERSE SOLIDUS # +29F9 ; 005C ; MA #* ( ⧹ → \ ) BIG REVERSE SOLIDUS → REVERSE 
SOLIDUS # +1D20F ; 005C ; MA #* ( 𝈏 → \ ) GREEK VOCAL NOTATION SYMBOL-16 → REVERSE SOLIDUS # +1D23B ; 005C ; MA #* ( 𝈻 → \ ) GREEK INSTRUMENTAL NOTATION SYMBOL-48 → REVERSE SOLIDUS # →𝈏→ +31D4 ; 005C ; MA #* ( ㇔ → \ ) CJK STROKE D → REVERSE SOLIDUS # →⼂→ +4E36 ; 005C ; MA # ( 丶 → \ ) CJK UNIFIED IDEOGRAPH-4E36 → REVERSE SOLIDUS # →⼂→ +2F02 ; 005C ; MA #* ( ⼂ → \ ) KANGXI RADICAL DOT → REVERSE SOLIDUS # + +2CF9 ; 005C 005C ; MA #* ( ⳹ → \\ ) COPTIC OLD NUBIAN FULL STOP → REVERSE SOLIDUS, REVERSE SOLIDUS # +244A ; 005C 005C ; MA #* ( ⑊ → \\ ) OCR DOUBLE BACKSLASH → REVERSE SOLIDUS, REVERSE SOLIDUS # + +27C8 ; 005C 1455 ; MA #* ( ⟈ → \ᑕ ) REVERSE SOLIDUS PRECEDING SUBSET → REVERSE SOLIDUS, CANADIAN SYLLABICS TA # →\⊂→ + +A778 ; 0026 ; MA # ( ꝸ → & ) LATIN SMALL LETTER UM → AMPERSAND # + +0AF0 ; 0970 ; MA #* ( ૰ → ॰ ) GUJARATI ABBREVIATION SIGN → DEVANAGARI ABBREVIATION SIGN # +110BB ; 0970 ; MA #* ( 𑂻 → ॰ ) KAITHI ABBREVIATION SIGN → DEVANAGARI ABBREVIATION SIGN # +111C7 ; 0970 ; MA #* ( 𑇇 → ॰ ) SHARADA ABBREVIATION SIGN → DEVANAGARI ABBREVIATION SIGN # +26AC ; 0970 ; MA #* ( ⚬ → ॰ ) MEDIUM SMALL WHITE CIRCLE → DEVANAGARI ABBREVIATION SIGN # + +111DB ; A8FC ; MA #* ( 𑇛 → ꣼ ) SHARADA SIGN SIDDHAM → DEVANAGARI SIGN SIDDHAM # + +17D9 ; 0E4F ; MA #* ( ៙ → ๏ ) KHMER SIGN PHNAEK MUAN → THAI CHARACTER FONGMAN # + +17D5 ; 0E5A ; MA #* ( ៕ → ๚ ) KHMER SIGN BARIYOOSAN → THAI CHARACTER ANGKHANKHU # + +17DA ; 0E5B ; MA #* ( ៚ → ๛ ) KHMER SIGN KOOMUUT → THAI CHARACTER KHOMUT # + +0F0C ; 0F0B ; MA #* ( ༌ → ་ ) TIBETAN MARK DELIMITER TSHEG BSTAR → TIBETAN MARK INTERSYLLABIC TSHEG # + +0F0E ; 0F0D 0F0D ; MA #* ( ༎ → །། ) TIBETAN MARK NYIS SHAD → TIBETAN MARK SHAD, TIBETAN MARK SHAD # + +02C4 ; 005E ; MA #* ( ˄ → ^ ) MODIFIER LETTER UP ARROWHEAD → CIRCUMFLEX ACCENT # +02C6 ; 005E ; MA # ( ˆ → ^ ) MODIFIER LETTER CIRCUMFLEX ACCENT → CIRCUMFLEX ACCENT # + +A67E ; 02C7 ; MA #* ( ꙾ → ˇ ) CYRILLIC KAVYKA → CARON # →˘→ +02D8 ; 02C7 ; MA #* ( ˘ → ˇ ) BREVE → CARON # + +203E ; 02C9 ; MA #* ( 
‾ → ˉ ) OVERLINE → MODIFIER LETTER MACRON # +FE49 ; 02C9 ; MA #* ( ﹉ → ˉ ) DASHED OVERLINE → MODIFIER LETTER MACRON # →‾→ +FE4A ; 02C9 ; MA #* ( ﹊ → ˉ ) CENTRELINE OVERLINE → MODIFIER LETTER MACRON # →‾→ +FE4B ; 02C9 ; MA #* ( ﹋ → ˉ ) WAVY OVERLINE → MODIFIER LETTER MACRON # →‾→ +FE4C ; 02C9 ; MA #* ( ﹌ → ˉ ) DOUBLE WAVY OVERLINE → MODIFIER LETTER MACRON # →‾→ +00AF ; 02C9 ; MA #* ( ¯ → ˉ ) MACRON → MODIFIER LETTER MACRON # +FFE3 ; 02C9 ; MA #* (  ̄ → ˉ ) FULLWIDTH MACRON → MODIFIER LETTER MACRON # →‾→ +2594 ; 02C9 ; MA #* ( ▔ → ˉ ) UPPER ONE EIGHTH BLOCK → MODIFIER LETTER MACRON # →¯→ + +044A ; 02C9 0062 ; MA # ( ъ → ˉb ) CYRILLIC SMALL LETTER HARD SIGN → MODIFIER LETTER MACRON, LATIN SMALL LETTER B # →¯b→ + +A651 ; 02C9 0062 0069 ; MA # ( ꙑ → ˉbi ) CYRILLIC SMALL LETTER YERU WITH BACK YER → MODIFIER LETTER MACRON, LATIN SMALL LETTER B, LATIN SMALL LETTER I # →ъı→ + +0375 ; 02CF ; MA #* ( ͵ → ˏ ) GREEK LOWER NUMERAL SIGN → MODIFIER LETTER LOW ACUTE ACCENT # + +02FB ; 02EA ; MA #* ( ˻ → ˪ ) MODIFIER LETTER BEGIN LOW TONE → MODIFIER LETTER YIN DEPARTING TONE MARK # +A716 ; 02EA ; MA #* ( ꜖ → ˪ ) MODIFIER LETTER EXTRA-LOW LEFT-STEM TONE BAR → MODIFIER LETTER YIN DEPARTING TONE MARK # + +A714 ; 02EB ; MA #* ( ꜔ → ˫ ) MODIFIER LETTER MID LEFT-STEM TONE BAR → MODIFIER LETTER YANG DEPARTING TONE MARK # + +3002 ; 02F3 ; MA #* ( 。 → ˳ ) IDEOGRAPHIC FULL STOP → MODIFIER LETTER LOW RING # + +2E30 ; 00B0 ; MA #* ( ⸰ → ° ) RING POINT → DEGREE SIGN # →∘→ +02DA ; 00B0 ; MA #* ( ˚ → ° ) RING ABOVE → DEGREE SIGN # +2218 ; 00B0 ; MA #* ( ∘ → ° ) RING OPERATOR → DEGREE SIGN # +25CB ; 00B0 ; MA #* ( ○ → ° ) WHITE CIRCLE → DEGREE SIGN # →◦→→∘→ +25E6 ; 00B0 ; MA #* ( ◦ → ° ) WHITE BULLET → DEGREE SIGN # →∘→ + +235C ; 00B0 0332 ; MA #* ( ⍜ → °̲ ) APL FUNCTIONAL SYMBOL CIRCLE UNDERBAR → DEGREE SIGN, COMBINING LOW LINE # →○̲→ +10ED0 ; 00B0 0332 ; MA #* ( 𐻐 → °̲ ) ARABIC BIBLICAL END OF VERSE → DEGREE SIGN, COMBINING LOW LINE # →⍜→→○̲→ + +2364 ; 00B0 0308 ; MA #* ( ⍤ → °̈ ) APL FUNCTIONAL 
SYMBOL JOT DIAERESIS → DEGREE SIGN, COMBINING DIAERESIS # →◦̈→→∘̈→ + +2103 ; 00B0 0043 ; MA #* ( ℃ → °C ) DEGREE CELSIUS → DEGREE SIGN, LATIN CAPITAL LETTER C # + +2109 ; 00B0 0046 ; MA #* ( ℉ → °F ) DEGREE FAHRENHEIT → DEGREE SIGN, LATIN CAPITAL LETTER F # + +0BF5 ; 0BF3 ; MA #* ( ௵ → ௳ ) TAMIL YEAR SIGN → TAMIL DAY SIGN # + +0F1B ; 0F1A 0F1A ; MA #* ( ༛ → ༚༚ ) TIBETAN SIGN RDEL DKAR GNYIS → TIBETAN SIGN RDEL DKAR GCIG, TIBETAN SIGN RDEL DKAR GCIG # + +0F1F ; 0F1A 0F1D ; MA #* ( ༟ → ༚༝ ) TIBETAN SIGN RDEL DKAR RDEL NAG → TIBETAN SIGN RDEL DKAR GCIG, TIBETAN SIGN RDEL NAG GCIG # + +0FCE ; 0F1D 0F1A ; MA #* ( ࿎ → ༝༚ ) TIBETAN SIGN RDEL NAG RDEL DKAR → TIBETAN SIGN RDEL NAG GCIG, TIBETAN SIGN RDEL DKAR GCIG # + +0F1E ; 0F1D 0F1D ; MA #* ( ༞ → ༝༝ ) TIBETAN SIGN RDEL NAG GNYIS → TIBETAN SIGN RDEL NAG GCIG, TIBETAN SIGN RDEL NAG GCIG # + +24B8 ; 00A9 ; MA #* ( Ⓒ → © ) CIRCLED LATIN CAPITAL LETTER C → COPYRIGHT SIGN # + +24C7 ; 00AE ; MA #* ( Ⓡ → ® ) CIRCLED LATIN CAPITAL LETTER R → REGISTERED SIGN # + +24C5 ; 2117 ; MA #* ( Ⓟ → ℗ ) CIRCLED LATIN CAPITAL LETTER P → SOUND RECORDING COPYRIGHT # + +1D21B ; 2144 ; MA #* ( 𝈛 → ⅄ ) GREEK VOCAL NOTATION SYMBOL-53 → TURNED SANS-SERIF CAPITAL Y # + +2BEC ; 219E ; MA #* ( ⯬ → ↞ ) LEFTWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS → LEFTWARDS TWO HEADED ARROW # + +2BED ; 219F ; MA #* ( ⯭ → ↟ ) UPWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS → UPWARDS TWO HEADED ARROW # + +2BEE ; 21A0 ; MA #* ( ⯮ → ↠ ) RIGHTWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS → RIGHTWARDS TWO HEADED ARROW # + +2BEF ; 21A1 ; MA #* ( ⯯ → ↡ ) DOWNWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS → DOWNWARDS TWO HEADED ARROW # + +21B5 ; 21B2 ; MA #* ( ↵ → ↲ ) DOWNWARDS ARROW WITH CORNER LEFTWARDS → DOWNWARDS ARROW WITH TIP LEFTWARDS # + +2965 ; 21C3 21C2 ; MA #* ( ⥥ → ⇃⇂ ) DOWNWARDS HARPOON WITH BARB LEFT BESIDE DOWNWARDS HARPOON WITH BARB RIGHT → DOWNWARDS HARPOON WITH BARB LEFTWARDS, DOWNWARDS HARPOON WITH BARB RIGHTWARDS # + +296F ; 21C3 16DA ; MA #* ( ⥯ 
→ ⇃ᛚ ) DOWNWARDS HARPOON WITH BARB LEFT BESIDE UPWARDS HARPOON WITH BARB RIGHT → DOWNWARDS HARPOON WITH BARB LEFTWARDS, RUNIC LETTER LAUKAZ LAGU LOGR L # →⇃↾→ + +1D6DB ; 2202 ; MA #* ( 𝛛 → ∂ ) MATHEMATICAL BOLD PARTIAL DIFFERENTIAL → PARTIAL DIFFERENTIAL # +1D715 ; 2202 ; MA #* ( 𝜕 → ∂ ) MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL → PARTIAL DIFFERENTIAL # +1D74F ; 2202 ; MA #* ( 𝝏 → ∂ ) MATHEMATICAL BOLD ITALIC PARTIAL DIFFERENTIAL → PARTIAL DIFFERENTIAL # +1D789 ; 2202 ; MA #* ( 𝞉 → ∂ ) MATHEMATICAL SANS-SERIF BOLD PARTIAL DIFFERENTIAL → PARTIAL DIFFERENTIAL # +1D7C3 ; 2202 ; MA #* ( 𝟃 → ∂ ) MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL → PARTIAL DIFFERENTIAL # +1E8CC ; 2202 ; MA #* ( ‎𞣌‎ → ∂ ) MENDE KIKAKUI DIGIT SIX → PARTIAL DIFFERENTIAL # + +1E8CD ; 2202 0335 ; MA #* ( ‎𞣍‎ → ∂̵ ) MENDE KIKAKUI DIGIT SEVEN → PARTIAL DIFFERENTIAL, COMBINING SHORT STROKE OVERLAY # →ð→ +00F0 ; 2202 0335 ; MA # ( ð → ∂̵ ) LATIN SMALL LETTER ETH → PARTIAL DIFFERENTIAL, COMBINING SHORT STROKE OVERLAY # + +2300 ; 2205 ; MA #* ( ⌀ → ∅ ) DIAMETER SIGN → EMPTY SET # + +1D6C1 ; 2207 ; MA #* ( 𝛁 → ∇ ) MATHEMATICAL BOLD NABLA → NABLA # +1D6FB ; 2207 ; MA #* ( 𝛻 → ∇ ) MATHEMATICAL ITALIC NABLA → NABLA # +1D735 ; 2207 ; MA #* ( 𝜵 → ∇ ) MATHEMATICAL BOLD ITALIC NABLA → NABLA # +1D76F ; 2207 ; MA #* ( 𝝯 → ∇ ) MATHEMATICAL SANS-SERIF BOLD NABLA → NABLA # +1D7A9 ; 2207 ; MA #* ( 𝞩 → ∇ ) MATHEMATICAL SANS-SERIF BOLD ITALIC NABLA → NABLA # +118A8 ; 2207 ; MA # ( 𑢨 → ∇ ) WARANG CITI CAPITAL LETTER E → NABLA # + +2362 ; 2207 0308 ; MA #* ( ⍢ → ∇̈ ) APL FUNCTIONAL SYMBOL DEL DIAERESIS → NABLA, COMBINING DIAERESIS # + +236B ; 2207 0334 ; MA #* ( ⍫ → ∇̴ ) APL FUNCTIONAL SYMBOL DEL TILDE → NABLA, COMBINING TILDE OVERLAY # + +2588 ; 220E ; MA #* ( █ → ∎ ) FULL BLOCK → END OF PROOF # →■→ +25A0 ; 220E ; MA #* ( ■ → ∎ ) BLACK SQUARE → END OF PROOF # + +2A3F ; 2210 ; MA #* ( ⨿ → ∐ ) AMALGAMATION OR COPRODUCT → N-ARY COPRODUCT # + +16ED ; 002B ; MA #* ( ᛭ → + ) RUNIC CROSS PUNCTUATION → PLUS SIGN # 
+2795 ; 002B ; MA #* ( ➕ → + ) HEAVY PLUS SIGN → PLUS SIGN # +1029B ; 002B ; MA # ( 𐊛 → + ) LYCIAN LETTER H → PLUS SIGN # +1E6E9 ; 002B ; MA # ( 𞛩 → + ) TAI YO LETTER IA → PLUS SIGN # + +2A23 ; 002B 0302 ; MA #* ( ⨣ → +̂ ) PLUS SIGN WITH CIRCUMFLEX ACCENT ABOVE → PLUS SIGN, COMBINING CIRCUMFLEX ACCENT # + +2A22 ; 002B 030A ; MA #* ( ⨢ → +̊ ) PLUS SIGN WITH SMALL CIRCLE ABOVE → PLUS SIGN, COMBINING RING ABOVE # + +2A24 ; 002B 0303 ; MA #* ( ⨤ → +̃ ) PLUS SIGN WITH TILDE ABOVE → PLUS SIGN, COMBINING TILDE # + +2214 ; 002B 0307 ; MA #* ( ∔ → +̇ ) DOT PLUS → PLUS SIGN, COMBINING DOT ABOVE # + +2A25 ; 002B 0323 ; MA #* ( ⨥ → +̣ ) PLUS SIGN WITH DOT BELOW → PLUS SIGN, COMBINING DOT BELOW # + +2A26 ; 002B 0330 ; MA #* ( ⨦ → +̰ ) PLUS SIGN WITH TILDE BELOW → PLUS SIGN, COMBINING TILDE BELOW # + +2A27 ; 002B 2082 ; MA #* ( ⨧ → +₂ ) PLUS SIGN WITH SUBSCRIPT TWO → PLUS SIGN, SUBSCRIPT TWO # + +2797 ; 00F7 ; MA #* ( ➗ → ÷ ) HEAVY DIVISION SIGN → DIVISION SIGN # + +2039 ; 003C ; MA #* ( ‹ → < ) SINGLE LEFT-POINTING ANGLE QUOTATION MARK → LESS-THAN SIGN # +276E ; 003C ; MA #* ( ❮ → < ) HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT → LESS-THAN SIGN # →‹→ +02C2 ; 003C ; MA #* ( ˂ → < ) MODIFIER LETTER LEFT ARROWHEAD → LESS-THAN SIGN # +1D236 ; 003C ; MA #* ( 𝈶 → < ) GREEK INSTRUMENTAL NOTATION SYMBOL-40 → LESS-THAN SIGN # +1438 ; 003C ; MA # ( ᐸ → < ) CANADIAN SYLLABICS PA → LESS-THAN SIGN # +16B2 ; 003C ; MA # ( ᚲ → < ) RUNIC LETTER KAUNA → LESS-THAN SIGN # + +22D6 ; 003C 00B7 ; MA #* ( ⋖ → <· ) LESS-THAN WITH DOT → LESS-THAN SIGN, MIDDLE DOT # →ᑅ→→ᐸᐧ→ +2CB5 ; 003C 00B7 ; MA # ( ⲵ → <· ) COPTIC SMALL LETTER OLD COPTIC AIN → LESS-THAN SIGN, MIDDLE DOT # →⋖→→ᑅ→→ᐸᐧ→ +2CB4 ; 003C 00B7 ; MA # ( Ⲵ → <· ) COPTIC CAPITAL LETTER OLD COPTIC AIN → LESS-THAN SIGN, MIDDLE DOT # →ᑅ→→ᐸᐧ→ +1445 ; 003C 00B7 ; MA # ( ᑅ → <· ) CANADIAN SYLLABICS WEST-CREE PWA → LESS-THAN SIGN, MIDDLE DOT # →ᐸᐧ→ + +226A ; 003C 003C ; MA #* ( ≪ → << ) MUCH LESS-THAN → LESS-THAN SIGN, LESS-THAN SIGN # + +22D8 ; 
003C 003C 003C ; MA #* ( ⋘ → <<< ) VERY MUCH LESS-THAN → LESS-THAN SIGN, LESS-THAN SIGN, LESS-THAN SIGN # + +1400 ; 003D ; MA #* ( ᐀ → = ) CANADIAN SYLLABICS HYPHEN → EQUALS SIGN # +2E40 ; 003D ; MA #* ( ⹀ → = ) DOUBLE HYPHEN → EQUALS SIGN # +30A0 ; 003D ; MA #* ( ゠ → = ) KATAKANA-HIRAGANA DOUBLE HYPHEN → EQUALS SIGN # +A4FF ; 003D ; MA #* ( ꓿ → = ) LISU PUNCTUATION FULL STOP → EQUALS SIGN # + +225A ; 003D 0306 ; MA #* ( ≚ → =̆ ) EQUIANGULAR TO → EQUALS SIGN, COMBINING BREVE # →=̌→ + +2259 ; 003D 0302 ; MA #* ( ≙ → =̂ ) ESTIMATES → EQUALS SIGN, COMBINING CIRCUMFLEX ACCENT # + +2257 ; 003D 030A ; MA #* ( ≗ → =̊ ) RING EQUAL TO → EQUALS SIGN, COMBINING RING ABOVE # + +2250 ; 003D 0307 ; MA #* ( ≐ → =̇ ) APPROACHES THE LIMIT → EQUALS SIGN, COMBINING DOT ABOVE # + +2251 ; 003D 0307 0323 ; MA #* ( ≑ → =̣̇ ) GEOMETRICALLY EQUAL TO → EQUALS SIGN, COMBINING DOT ABOVE, COMBINING DOT BELOW # →≐̣→ + +2B96 ; 003D 1AB2 ; MA #* ( ⮖ → =᪲ ) EQUALS SIGN WITH INFINITY ABOVE → EQUALS SIGN, COMBINING INFINITY # + +2A6E ; 003D 20F0 ; MA #* ( ⩮ → =⃰ ) EQUALS WITH ASTERISK → EQUALS SIGN, COMBINING ASTERISK ABOVE # + +2A75 ; 003D 003D ; MA #* ( ⩵ → == ) TWO CONSECUTIVE EQUALS SIGNS → EQUALS SIGN, EQUALS SIGN # + +2A76 ; 003D 003D 003D ; MA #* ( ⩶ → === ) THREE CONSECUTIVE EQUALS SIGNS → EQUALS SIGN, EQUALS SIGN, EQUALS SIGN # + +225E ; 003D 036B ; MA #* ( ≞ → =ͫ ) MEASURED BY → EQUALS SIGN, COMBINING LATIN SMALL LETTER M # + +203A ; 003E ; MA #* ( › → > ) SINGLE RIGHT-POINTING ANGLE QUOTATION MARK → GREATER-THAN SIGN # +276F ; 003E ; MA #* ( ❯ → > ) HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT → GREATER-THAN SIGN # →›→ +02C3 ; 003E ; MA #* ( ˃ → > ) MODIFIER LETTER RIGHT ARROWHEAD → GREATER-THAN SIGN # +1D237 ; 003E ; MA #* ( 𝈷 → > ) GREEK INSTRUMENTAL NOTATION SYMBOL-42 → GREATER-THAN SIGN # +1433 ; 003E ; MA # ( ᐳ → > ) CANADIAN SYLLABICS PO → GREATER-THAN SIGN # +16F3F ; 003E ; MA # ( 𖼿 → > ) MIAO LETTER ARCHAIC ZZA → GREATER-THAN SIGN # + +1441 ; 003E 00B7 ; MA # ( ᑁ → >· ) 
CANADIAN SYLLABICS WEST-CREE PWO → GREATER-THAN SIGN, MIDDLE DOT # →ᐳᐧ→ + +2AA5 ; 003E 003C ; MA #* ( ⪥ → >< ) GREATER-THAN BESIDE LESS-THAN → GREATER-THAN SIGN, LESS-THAN SIGN # + +226B ; 003E 003E ; MA #* ( ≫ → >> ) MUCH GREATER-THAN → GREATER-THAN SIGN, GREATER-THAN SIGN # +2A20 ; 003E 003E ; MA #* ( ⨠ → >> ) Z NOTATION SCHEMA PIPING → GREATER-THAN SIGN, GREATER-THAN SIGN # →≫→ + +22D9 ; 003E 003E 003E ; MA #* ( ⋙ → >>> ) VERY MUCH GREATER-THAN → GREATER-THAN SIGN, GREATER-THAN SIGN, GREATER-THAN SIGN # + +2053 ; 007E ; MA #* ( ⁓ → ~ ) SWUNG DASH → TILDE # +02DC ; 007E ; MA #* ( ˜ → ~ ) SMALL TILDE → TILDE # +1FC0 ; 007E ; MA #* ( ῀ → ~ ) GREEK PERISPOMENI → TILDE # →˜→ +223C ; 007E ; MA #* ( ∼ → ~ ) TILDE OPERATOR → TILDE # + +2368 ; 007E 0308 ; MA #* ( ⍨ → ~̈ ) APL FUNCTIONAL SYMBOL TILDE DIAERESIS → TILDE, COMBINING DIAERESIS # + +2E1E ; 007E 0307 ; MA #* ( ⸞ → ~̇ ) TILDE WITH DOT ABOVE → TILDE, COMBINING DOT ABOVE # →⩪→→∼̇→→⁓̇→ +2A6A ; 007E 0307 ; MA #* ( ⩪ → ~̇ ) TILDE OPERATOR WITH DOT ABOVE → TILDE, COMBINING DOT ABOVE # →∼̇→→⁓̇→ + +2E1F ; 007E 0323 ; MA #* ( ⸟ → ~̣ ) TILDE WITH DOT BELOW → TILDE, COMBINING DOT BELOW # + +1E8C8 ; 2220 ; MA #* ( ‎𞣈‎ → ∠ ) MENDE KIKAKUI DIGIT TWO → ANGLE # + +22C0 ; 2227 ; MA #* ( ⋀ → ∧ ) N-ARY LOGICAL AND → LOGICAL AND # + +222F ; 222E 222E ; MA #* ( ∯ → ∮∮ ) SURFACE INTEGRAL → CONTOUR INTEGRAL, CONTOUR INTEGRAL # + +2230 ; 222E 222E 222E ; MA #* ( ∰ → ∮∮∮ ) VOLUME INTEGRAL → CONTOUR INTEGRAL, CONTOUR INTEGRAL, CONTOUR INTEGRAL # + +2E2B ; 2234 ; MA #* ( ⸫ → ∴ ) ONE DOT OVER TWO DOTS PUNCTUATION → THEREFORE # + +2E2A ; 2235 ; MA #* ( ⸪ → ∵ ) TWO DOTS OVER ONE DOT PUNCTUATION → BECAUSE # + +2E2C ; 2237 ; MA #* ( ⸬ → ∷ ) SQUARED FOUR DOT PUNCTUATION → PROPORTION # + +111DE ; 2248 ; MA #* ( 𑇞 → ≈ ) SHARADA SECTION MARK-1 → ALMOST EQUAL TO # + +264E ; 224F ; MA #* ( ♎ → ≏ ) LIBRA → DIFFERENCE BETWEEN # +1F75E ; 224F ; MA #* ( 🝞 → ≏ ) ALCHEMICAL SYMBOL FOR SUBLIMATION → DIFFERENCE BETWEEN # →♎→ + +2263 ; 2261 ; MA #* ( ≣ → ≡ ) 
STRICTLY EQUIVALENT TO → IDENTICAL TO # +2CB7 ; 2261 ; MA # ( ⲷ → ≡ ) COPTIC SMALL LETTER CRYPTOGRAMMIC EIE → IDENTICAL TO # + +2A03 ; 228D ; MA #* ( ⨃ → ⊍ ) N-ARY UNION OPERATOR WITH DOT → MULTISET MULTIPLICATION # + +2A04 ; 228E ; MA #* ( ⨄ → ⊎ ) N-ARY UNION OPERATOR WITH PLUS → MULTISET UNION # + +1D238 ; 228F ; MA #* ( 𝈸 → ⊏ ) GREEK INSTRUMENTAL NOTATION SYMBOL-43 → SQUARE IMAGE OF # + +1D239 ; 2290 ; MA #* ( 𝈹 → ⊐ ) GREEK INSTRUMENTAL NOTATION SYMBOL-45 → SQUARE ORIGINAL OF # + +2A05 ; 2293 ; MA #* ( ⨅ → ⊓ ) N-ARY SQUARE INTERSECTION OPERATOR → SQUARE CAP # + +2A06 ; 2294 ; MA #* ( ⨆ → ⊔ ) N-ARY SQUARE UNION OPERATOR → SQUARE CUP # + +2A02 ; 2297 ; MA #* ( ⨂ → ⊗ ) N-ARY CIRCLED TIMES OPERATOR → CIRCLED TIMES # + +235F ; 229B ; MA #* ( ⍟ → ⊛ ) APL FUNCTIONAL SYMBOL CIRCLE STAR → CIRCLED ASTERISK OPERATOR # + +1F771 ; 22A0 ; MA #* ( 🝱 → ⊠ ) ALCHEMICAL SYMBOL FOR MONTH → SQUARED TIMES # + +1F755 ; 22A1 ; MA #* ( 🝕 → ⊡ ) ALCHEMICAL SYMBOL FOR URINE → SQUARED DOT OPERATOR # + +25C1 ; 22B2 ; MA #* ( ◁ → ⊲ ) WHITE LEFT-POINTING TRIANGLE → NORMAL SUBGROUP OF # + +25B7 ; 22B3 ; MA #* ( ▷ → ⊳ ) WHITE RIGHT-POINTING TRIANGLE → CONTAINS AS NORMAL SUBGROUP # + +2363 ; 22C6 0308 ; MA #* ( ⍣ → ⋆̈ ) APL FUNCTIONAL SYMBOL STAR DIAERESIS → STAR OPERATOR, COMBINING DIAERESIS # + +FE34 ; 2307 ; MA # ( ︴ → ⌇ ) PRESENTATION FORM FOR VERTICAL WAVY LOW LINE → WAVY LINE # + +25E0 ; 2312 ; MA #* ( ◠ → ⌒ ) UPPER HALF CIRCLE → ARC # + +2A3D ; 2319 ; MA #* ( ⨽ → ⌙ ) RIGHTHAND INTERIOR PRODUCT → TURNED NOT SIGN # + +2325 ; 2324 ; MA #* ( ⌥ → ⌤ ) OPTION KEY → UP ARROWHEAD BETWEEN TWO HORIZONTAL BARS # + +29C7 ; 233B ; MA #* ( ⧇ → ⌻ ) SQUARED SMALL CIRCLE → APL FUNCTIONAL SYMBOL QUAD JOT # + +25CE ; 233E ; MA #* ( ◎ → ⌾ ) BULLSEYE → APL FUNCTIONAL SYMBOL CIRCLE JOT # →⦾→ +29BE ; 233E ; MA #* ( ⦾ → ⌾ ) CIRCLED WHITE BULLET → APL FUNCTIONAL SYMBOL CIRCLE JOT # + +29C5 ; 2342 ; MA #* ( ⧅ → ⍂ ) SQUARED FALLING DIAGONAL SLASH → APL FUNCTIONAL SYMBOL QUAD BACKSLASH # + +29B0 ; 2349 ; MA #* ( ⦰ → ⍉ 
) REVERSED EMPTY SET → APL FUNCTIONAL SYMBOL CIRCLE BACKSLASH # + +23C3 ; 234B ; MA #* ( ⏃ → ⍋ ) DENTISTRY SYMBOL LIGHT VERTICAL WITH TRIANGLE → APL FUNCTIONAL SYMBOL DELTA STILE # + +23C2 ; 234E ; MA #* ( ⏂ → ⍎ ) DENTISTRY SYMBOL LIGHT UP AND HORIZONTAL WITH CIRCLE → APL FUNCTIONAL SYMBOL DOWN TACK JOT # + +23C1 ; 2355 ; MA #* ( ⏁ → ⍕ ) DENTISTRY SYMBOL LIGHT DOWN AND HORIZONTAL WITH CIRCLE → APL FUNCTIONAL SYMBOL UP TACK JOT # + +23C6 ; 236D ; MA #* ( ⏆ → ⍭ ) DENTISTRY SYMBOL LIGHT VERTICAL AND WAVE → APL FUNCTIONAL SYMBOL STILE TILDE # + +2638 ; 2388 ; MA #* ( ☸ → ⎈ ) WHEEL OF DHARMA → HELM SYMBOL # + +FE35 ; 23DC ; MA #* ( ︵ → ⏜ ) PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS → TOP PARENTHESIS # + +FE36 ; 23DD ; MA #* ( ︶ → ⏝ ) PRESENTATION FORM FOR VERTICAL RIGHT PARENTHESIS → BOTTOM PARENTHESIS # + +FE37 ; 23DE ; MA #* ( ︷ → ⏞ ) PRESENTATION FORM FOR VERTICAL LEFT CURLY BRACKET → TOP CURLY BRACKET # + +FE38 ; 23DF ; MA #* ( ︸ → ⏟ ) PRESENTATION FORM FOR VERTICAL RIGHT CURLY BRACKET → BOTTOM CURLY BRACKET # + +FE39 ; 23E0 ; MA #* ( ︹ → ⏠ ) PRESENTATION FORM FOR VERTICAL LEFT TORTOISE SHELL BRACKET → TOP TORTOISE SHELL BRACKET # + +FE3A ; 23E1 ; MA #* ( ︺ → ⏡ ) PRESENTATION FORM FOR VERTICAL RIGHT TORTOISE SHELL BRACKET → BOTTOM TORTOISE SHELL BRACKET # + +25B1 ; 23E5 ; MA #* ( ▱ → ⏥ ) WHITE PARALLELOGRAM → FLATNESS # + +23FC ; 23FB ; MA #* ( ⏼ → ⏻ ) POWER ON-OFF SYMBOL → POWER SYMBOL # + +FE31 ; 2502 ; MA #* ( ︱ → │ ) PRESENTATION FORM FOR VERTICAL EM DASH → BOX DRAWINGS LIGHT VERTICAL # →|→ +FF5C ; 2502 ; MA #* ( | → │ ) FULLWIDTH VERTICAL LINE → BOX DRAWINGS LIGHT VERTICAL # +2503 ; 2502 ; MA #* ( ┃ → │ ) BOX DRAWINGS HEAVY VERTICAL → BOX DRAWINGS LIGHT VERTICAL # + +250F ; 250C ; MA #* ( ┏ → ┌ ) BOX DRAWINGS HEAVY DOWN AND RIGHT → BOX DRAWINGS LIGHT DOWN AND RIGHT # + +2523 ; 251C ; MA #* ( ┣ → ├ ) BOX DRAWINGS HEAVY VERTICAL AND RIGHT → BOX DRAWINGS LIGHT VERTICAL AND RIGHT # + +2590 ; 258C ; MA #* ( ▐ → ▌ ) RIGHT HALF BLOCK → LEFT HALF BLOCK # + +2597 ; 
2596 ; MA #* ( ▗ → ▖ ) QUADRANT LOWER RIGHT → QUADRANT LOWER LEFT # + +259D ; 2598 ; MA #* ( ▝ → ▘ ) QUADRANT UPPER RIGHT → QUADRANT UPPER LEFT # + +2610 ; 25A1 ; MA #* ( ☐ → □ ) BALLOT BOX → WHITE SQUARE # + +FFED ; 25AA ; MA #* ( ■ → ▪ ) HALFWIDTH BLACK SQUARE → BLACK SMALL SQUARE # + +25B8 ; 25B6 ; MA #* ( ▸ → ▶ ) BLACK RIGHT-POINTING SMALL TRIANGLE → BLACK RIGHT-POINTING TRIANGLE # →►→ +25BA ; 25B6 ; MA #* ( ► → ▶ ) BLACK RIGHT-POINTING POINTER → BLACK RIGHT-POINTING TRIANGLE # + +2CE9 ; 2627 ; MA #* ( ⳩ → ☧ ) COPTIC SYMBOL KHI RO → CHI RHO # + +1F70A ; 2629 ; MA #* ( 🜊 → ☩ ) ALCHEMICAL SYMBOL FOR VINEGAR → CROSS OF JERUSALEM # + +1F312 ; 263D ; MA #* ( 🌒 → ☽ ) WAXING CRESCENT MOON SYMBOL → FIRST QUARTER MOON # +1F319 ; 263D ; MA #* ( 🌙 → ☽ ) CRESCENT MOON → FIRST QUARTER MOON # + +23FE ; 263E ; MA #* ( ⏾ → ☾ ) POWER SLEEP SYMBOL → LAST QUARTER MOON # +1F318 ; 263E ; MA #* ( 🌘 → ☾ ) WANING CRESCENT MOON SYMBOL → LAST QUARTER MOON # + +29D9 ; 299A ; MA #* ( ⧙ → ⦚ ) RIGHT WIGGLY FENCE → VERTICAL ZIGZAG LINE # + +1F73A ; 29DF ; MA #* ( 🜺 → ⧟ ) ALCHEMICAL SYMBOL FOR ARSENIC → DOUBLE-ENDED MULTIMAP # + +2A3E ; 2A1F ; MA #* ( ⨾ → ⨟ ) Z NOTATION RELATIONAL COMPOSITION → Z NOTATION SCHEMA COMPOSITION # + +2669 ; 1D158 1D165 ; MA #* ( ♩ → 𝅘𝅥 ) QUARTER NOTE → MUSICAL SYMBOL NOTEHEAD BLACK, MUSICAL SYMBOL COMBINING STEM # + +266A ; 1D158 1D165 1D16E ; MA #* ( ♪ → 𝅘𝅥𝅮 ) EIGHTH NOTE → MUSICAL SYMBOL NOTEHEAD BLACK, MUSICAL SYMBOL COMBINING STEM, MUSICAL SYMBOL COMBINING FLAG-1 # + +24EA ; 1F10D ; MA #* ( ⓪ → 🄍 ) CIRCLED DIGIT ZERO → CIRCLED ZERO WITH SLASH # + +21BA ; 1F10E ; MA #* ( ↺ → 🄎 ) ANTICLOCKWISE OPEN CIRCLE ARROW → CIRCLED ANTICLOCKWISE ARROW # + +1CCFB ; 1F6F8 ; MA #* ( 𜳻 → 🛸 ) FLYING SAUCER SYMBOL → FLYING SAUCER # + +02D9 ; 0971 ; MA #* ( ˙ → ॱ ) DOT ABOVE → DEVANAGARI SIGN HIGH SPACING DOT # +0D4E ; 0971 ; MA # ( ൎ → ॱ ) MALAYALAM LETTER DOT REPH → DEVANAGARI SIGN HIGH SPACING DOT # →˙→ + +FF0D ; 30FC ; MA #* ( - → ー ) FULLWIDTH HYPHEN-MINUS → 
KATAKANA-HIRAGANA PROLONGED SOUND MARK # +2014 ; 30FC ; MA #* ( — → ー ) EM DASH → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →一→ +2015 ; 30FC ; MA #* ( ― → ー ) HORIZONTAL BAR → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →—→→一→ +2500 ; 30FC ; MA #* ( ─ → ー ) BOX DRAWINGS LIGHT HORIZONTAL → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →━→→—→→一→ +2501 ; 30FC ; MA #* ( ━ → ー ) BOX DRAWINGS HEAVY HORIZONTAL → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →—→→一→ +31D0 ; 30FC ; MA #* ( ㇐ → ー ) CJK STROKE H → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →一→ +A7F7 ; 30FC ; MA # ( ꟷ → ー ) LATIN EPIGRAPHIC LETTER SIDEWAYS I → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →—→→一→ +1173 ; 30FC ; MA # ( ᅳ → ー ) HANGUL JUNGSEONG EU → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →ㅡ→→—→→一→ +3161 ; 30FC ; MA # ( ㅡ → ー ) HANGUL LETTER EU → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →—→→一→ +4E00 ; 30FC ; MA # ( 一 → ー ) CJK UNIFIED IDEOGRAPH-4E00 → KATAKANA-HIRAGANA PROLONGED SOUND MARK # +2F00 ; 30FC ; MA #* ( ⼀ → ー ) KANGXI RADICAL ONE → KATAKANA-HIRAGANA PROLONGED SOUND MARK # →一→ + +1196 ; 30FC 30FC ; MA # ( ᆖ → ーー ) HANGUL JUNGSEONG EU-EU → KATAKANA-HIRAGANA PROLONGED SOUND MARK, KATAKANA-HIRAGANA PROLONGED SOUND MARK # →ᅳᅳ→ + +D7B9 ; 30FC 1161 ; MA # ( ힹ → ーᅡ ) HANGUL JUNGSEONG EU-A → KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL JUNGSEONG A # →ᅳᅡ→ + +D7BA ; 30FC 1165 ; MA # ( ힺ → ーᅥ ) HANGUL JUNGSEONG EU-EO → KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL JUNGSEONG EO # →ᅳᅥ→ + +D7BB ; 30FC 1165 4E28 ; MA # ( ힻ → ーᅥ丨 ) HANGUL JUNGSEONG EU-E → KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL JUNGSEONG EO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅳᅥᅵ→ + +D7BC ; 30FC 1169 ; MA # ( ힼ → ーᅩ ) HANGUL JUNGSEONG EU-O → KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL JUNGSEONG O # →ᅳᅩ→ + +1195 ; 30FC 116E ; MA # ( ᆕ → ーᅮ ) HANGUL JUNGSEONG EU-U → KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL JUNGSEONG U # →ᅳᅮ→ + +1174 ; 30FC 4E28 ; MA # ( ᅴ → ー丨 ) HANGUL JUNGSEONG YI → KATAKANA-HIRAGANA PROLONGED SOUND MARK, CJK UNIFIED 
IDEOGRAPH-4E28 # →ᅳᅵ→ +3162 ; 30FC 4E28 ; MA # ( ㅢ → ー丨 ) HANGUL LETTER YI → KATAKANA-HIRAGANA PROLONGED SOUND MARK, CJK UNIFIED IDEOGRAPH-4E28 # →ᅴ→→ᅳᅵ→ + +1197 ; 30FC 4E28 116E ; MA # ( ᆗ → ー丨ᅮ ) HANGUL JUNGSEONG YI-U → KATAKANA-HIRAGANA PROLONGED SOUND MARK, CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG U # →ᅳᅵᅮ→ + +1F10F ; 0024 20E0 ; MA #* ( 🄏 → $⃠ ) CIRCLED DOLLAR SIGN WITH OVERLAID BACKSLASH → DOLLAR SIGN, COMBINING ENCLOSING CIRCLE BACKSLASH # + +20A4 ; 00A3 ; MA #* ( ₤ → £ ) LIRA SIGN → POUND SIGN # + +3012 ; 20B8 ; MA #* ( 〒 → ₸ ) POSTAL MARK → TENGE SIGN # +3036 ; 20B8 ; MA #* ( 〶 → ₸ ) CIRCLED POSTAL MARK → TENGE SIGN # →〒→ + +1B5C ; 1B50 ; MA #* ( ᭜ → ᭐ ) BALINESE WINDU → BALINESE DIGIT ZERO # + +A9C6 ; A9D0 ; MA #* ( ꧆ → ꧐ ) JAVANESE PADA WINDU → JAVANESE DIGIT ZERO # + +114D1 ; 09E7 ; MA # ( 𑓑 → ১ ) TIRHUTA DIGIT ONE → BENGALI DIGIT ONE # + +0CE7 ; 0C67 ; MA # ( ೧ → ౧ ) KANNADA DIGIT ONE → TELUGU DIGIT ONE # + +1065 ; 1041 ; MA # ( ၥ → ၁ ) MYANMAR LETTER WESTERN PWO KAREN THA → MYANMAR DIGIT ONE # + +2460 ; 2780 ; MA #* ( ① → ➀ ) CIRCLED DIGIT ONE → DINGBAT CIRCLED SANS-SERIF DIGIT ONE # + +2469 ; 2789 ; MA #* ( ⑩ → ➉ ) CIRCLED NUMBER TEN → DINGBAT CIRCLED SANS-SERIF NUMBER TEN # + +23E8 ; 2081 2080 ; MA #* ( ⏨ → ₁₀ ) DECIMAL EXPONENT SYMBOL → SUBSCRIPT ONE, SUBSCRIPT ZERO # + +1CCF2 ; 0032 ; MA # ( 𜳲 → 2 ) OUTLINED DIGIT TWO → DIGIT TWO # +1D7D0 ; 0032 ; MA # ( 𝟐 → 2 ) MATHEMATICAL BOLD DIGIT TWO → DIGIT TWO # +1D7DA ; 0032 ; MA # ( 𝟚 → 2 ) MATHEMATICAL DOUBLE-STRUCK DIGIT TWO → DIGIT TWO # +1D7E4 ; 0032 ; MA # ( 𝟤 → 2 ) MATHEMATICAL SANS-SERIF DIGIT TWO → DIGIT TWO # +1D7EE ; 0032 ; MA # ( 𝟮 → 2 ) MATHEMATICAL SANS-SERIF BOLD DIGIT TWO → DIGIT TWO # +1D7F8 ; 0032 ; MA # ( 𝟸 → 2 ) MATHEMATICAL MONOSPACE DIGIT TWO → DIGIT TWO # +1FBF2 ; 0032 ; MA # ( 🯲 → 2 ) SEGMENTED DIGIT TWO → DIGIT TWO # +A75A ; 0032 ; MA # ( Ꝛ → 2 ) LATIN CAPITAL LETTER R ROTUNDA → DIGIT TWO # +01A7 ; 0032 ; MA # ( Ƨ → 2 ) LATIN CAPITAL LETTER TONE TWO → DIGIT TWO # +03E8 ; 0032 
; MA # ( Ϩ → 2 ) COPTIC CAPITAL LETTER HORI → DIGIT TWO # →Ƨ→ +A644 ; 0032 ; MA # ( Ꙅ → 2 ) CYRILLIC CAPITAL LETTER REVERSED DZE → DIGIT TWO # →Ƨ→ +14BF ; 0032 ; MA # ( ᒿ → 2 ) CANADIAN SYLLABICS SAYISI M → DIGIT TWO # +A6EF ; 0032 ; MA # ( ꛯ → 2 ) BAMUM LETTER KOGHOM → DIGIT TWO # →Ƨ→ + +A9CF ; 0662 ; MA # ( ꧏ → ‎٢‎ ) JAVANESE PANGRANGKEP → ARABIC-INDIC DIGIT TWO # +06F2 ; 0662 ; MA # ( ۲ → ‎٢‎ ) EXTENDED ARABIC-INDIC DIGIT TWO → ARABIC-INDIC DIGIT TWO # + +0AE8 ; 0968 ; MA # ( ૨ → २ ) GUJARATI DIGIT TWO → DEVANAGARI DIGIT TWO # +0AB0 ; 0968 ; MA # ( ર → २ ) GUJARATI LETTER RA → DEVANAGARI DIGIT TWO # →૨→ + +114D2 ; 09E8 ; MA # ( 𑓒 → ২ ) TIRHUTA DIGIT TWO → BENGALI DIGIT TWO # + +0CE8 ; 0C68 ; MA # ( ೨ → ౨ ) KANNADA DIGIT TWO → TELUGU DIGIT TWO # + +2461 ; 2781 ; MA #* ( ② → ➁ ) CIRCLED DIGIT TWO → DINGBAT CIRCLED SANS-SERIF DIGIT TWO # + +01BB ; 0032 0335 ; MA # ( ƻ → 2̵ ) LATIN LETTER TWO WITH STROKE → DIGIT TWO, COMBINING SHORT STROKE OVERLAY # + +1F103 ; 0032 002C ; MA #* ( 🄃 → 2, ) DIGIT TWO COMMA → DIGIT TWO, COMMA # + +2489 ; 0032 002E ; MA #* ( ⒉ → 2. 
) DIGIT TWO FULL STOP → DIGIT TWO, FULL STOP # + +33F5 ; 0032 0032 65E5 ; MA #* ( ㏵ → 22日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-TWO → DIGIT TWO, DIGIT TWO, CJK UNIFIED IDEOGRAPH-65E5 # + +336E ; 0032 0032 70B9 ; MA #* ( ㍮ → 22点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-TWO → DIGIT TWO, DIGIT TWO, CJK UNIFIED IDEOGRAPH-70B9 # + +33F6 ; 0032 0033 65E5 ; MA #* ( ㏶ → 23日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-THREE → DIGIT TWO, DIGIT THREE, CJK UNIFIED IDEOGRAPH-65E5 # + +336F ; 0032 0033 70B9 ; MA #* ( ㍯ → 23点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-THREE → DIGIT TWO, DIGIT THREE, CJK UNIFIED IDEOGRAPH-70B9 # + +33F7 ; 0032 0034 65E5 ; MA #* ( ㏷ → 24日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-FOUR → DIGIT TWO, DIGIT FOUR, CJK UNIFIED IDEOGRAPH-65E5 # + +3370 ; 0032 0034 70B9 ; MA #* ( ㍰ → 24点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-FOUR → DIGIT TWO, DIGIT FOUR, CJK UNIFIED IDEOGRAPH-70B9 # + +33F8 ; 0032 0035 65E5 ; MA #* ( ㏸ → 25日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-FIVE → DIGIT TWO, DIGIT FIVE, CJK UNIFIED IDEOGRAPH-65E5 # + +33F9 ; 0032 0036 65E5 ; MA #* ( ㏹ → 26日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-SIX → DIGIT TWO, DIGIT SIX, CJK UNIFIED IDEOGRAPH-65E5 # + +33FA ; 0032 0037 65E5 ; MA #* ( ㏺ → 27日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-SEVEN → DIGIT TWO, DIGIT SEVEN, CJK UNIFIED IDEOGRAPH-65E5 # + +33FB ; 0032 0038 65E5 ; MA #* ( ㏻ → 28日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-EIGHT → DIGIT TWO, DIGIT EIGHT, CJK UNIFIED IDEOGRAPH-65E5 # + +33FC ; 0032 0039 65E5 ; MA #* ( ㏼ → 29日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-NINE → DIGIT TWO, DIGIT NINE, CJK UNIFIED IDEOGRAPH-65E5 # + +33F4 ; 0032 006C 65E5 ; MA #* ( ㏴ → 2l日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-ONE → DIGIT TWO, LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-65E5 # →21日→ + +336D ; 0032 006C 70B9 ; MA #* ( ㍭ → 2l点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-ONE → DIGIT TWO, LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-70B9 
# →21点→ + +249B ; 0032 004F 002E ; MA #* ( ⒛ → 2O. ) NUMBER TWENTY FULL STOP → DIGIT TWO, LATIN CAPITAL LETTER O, FULL STOP # →20.→ + +33F3 ; 0032 004F 65E5 ; MA #* ( ㏳ → 2O日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY → DIGIT TWO, LATIN CAPITAL LETTER O, CJK UNIFIED IDEOGRAPH-65E5 # →20日→ + +336C ; 0032 004F 70B9 ; MA #* ( ㍬ → 2O点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY → DIGIT TWO, LATIN CAPITAL LETTER O, CJK UNIFIED IDEOGRAPH-70B9 # →20点→ + +0DE9 ; 0DE8 0DCF ; MA # ( ෩ → ෨ා ) SINHALA LITH DIGIT THREE → SINHALA LITH DIGIT TWO, SINHALA VOWEL SIGN AELA-PILLA # + +0DEF ; 0DE8 0DD3 ; MA # ( ෯ → ෨ී ) SINHALA LITH DIGIT NINE → SINHALA LITH DIGIT TWO, SINHALA VOWEL SIGN DIGA IS-PILLA # + +33E1 ; 0032 65E5 ; MA #* ( ㏡ → 2日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWO → DIGIT TWO, CJK UNIFIED IDEOGRAPH-65E5 # + +32C1 ; 0032 6708 ; MA #* ( ㋁ → 2月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR FEBRUARY → DIGIT TWO, CJK UNIFIED IDEOGRAPH-6708 # + +335A ; 0032 70B9 ; MA #* ( ㍚ → 2点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWO → DIGIT TWO, CJK UNIFIED IDEOGRAPH-70B9 # + +1D206 ; 0033 ; MA #* ( 𝈆 → 3 ) GREEK VOCAL NOTATION SYMBOL-7 → DIGIT THREE # +0969 ; 0033 ; MA # ( ३ → 3 ) DEVANAGARI DIGIT THREE → DIGIT THREE # →૩→ +0AE9 ; 0033 ; MA # ( ૩ → 3 ) GUJARATI DIGIT THREE → DIGIT THREE # +1CCF3 ; 0033 ; MA # ( 𜳳 → 3 ) OUTLINED DIGIT THREE → DIGIT THREE # +1D7D1 ; 0033 ; MA # ( 𝟑 → 3 ) MATHEMATICAL BOLD DIGIT THREE → DIGIT THREE # +1D7DB ; 0033 ; MA # ( 𝟛 → 3 ) MATHEMATICAL DOUBLE-STRUCK DIGIT THREE → DIGIT THREE # +1D7E5 ; 0033 ; MA # ( 𝟥 → 3 ) MATHEMATICAL SANS-SERIF DIGIT THREE → DIGIT THREE # +1D7EF ; 0033 ; MA # ( 𝟯 → 3 ) MATHEMATICAL SANS-SERIF BOLD DIGIT THREE → DIGIT THREE # +1D7F9 ; 0033 ; MA # ( 𝟹 → 3 ) MATHEMATICAL MONOSPACE DIGIT THREE → DIGIT THREE # +1FBF3 ; 0033 ; MA # ( 🯳 → 3 ) SEGMENTED DIGIT THREE → DIGIT THREE # +A7AB ; 0033 ; MA # ( Ɜ → 3 ) LATIN CAPITAL LETTER REVERSED OPEN E → DIGIT THREE # +021C ; 0033 ; MA # ( Ȝ → 3 ) LATIN CAPITAL LETTER YOGH → DIGIT THREE # →Ʒ→ 
+01B7 ; 0033 ; MA # ( Ʒ → 3 ) LATIN CAPITAL LETTER EZH → DIGIT THREE # +A76A ; 0033 ; MA # ( Ꝫ → 3 ) LATIN CAPITAL LETTER ET → DIGIT THREE # +2C9C ; 0033 ; MA # ( Ⲝ → 3 ) COPTIC CAPITAL LETTER KSI → DIGIT THREE # →Ʒ→ +2CC4 ; 0033 ; MA # ( Ⳅ → 3 ) COPTIC CAPITAL LETTER OLD COPTIC SHEI → DIGIT THREE # →Ʒ→ +2CCC ; 0033 ; MA # ( Ⳍ → 3 ) COPTIC CAPITAL LETTER OLD COPTIC HORI → DIGIT THREE # →Ȝ→→Ʒ→ +0417 ; 0033 ; MA # ( З → 3 ) CYRILLIC CAPITAL LETTER ZE → DIGIT THREE # +04E0 ; 0033 ; MA # ( Ӡ → 3 ) CYRILLIC CAPITAL LETTER ABKHASIAN DZE → DIGIT THREE # →Ʒ→ +16F3B ; 0033 ; MA # ( 𖼻 → 3 ) MIAO LETTER ZA → DIGIT THREE # →Ʒ→ +118CA ; 0033 ; MA # ( 𑣊 → 3 ) WARANG CITI SMALL LETTER ANG → DIGIT THREE # + +06F3 ; 0663 ; MA # ( ۳ → ‎٣‎ ) EXTENDED ARABIC-INDIC DIGIT THREE → ARABIC-INDIC DIGIT THREE # +1E8C9 ; 0663 ; MA #* ( ‎𞣉‎ → ‎٣‎ ) MENDE KIKAKUI DIGIT THREE → ARABIC-INDIC DIGIT THREE # + +2462 ; 2782 ; MA #* ( ③ → ➂ ) CIRCLED DIGIT THREE → DINGBAT CIRCLED SANS-SERIF DIGIT THREE # + +0498 ; 0033 0326 ; MA # ( Ҙ → 3̦ ) CYRILLIC CAPITAL LETTER ZE WITH DESCENDER → DIGIT THREE, COMBINING COMMA BELOW # →З̧→ + +1F104 ; 0033 002C ; MA #* ( 🄄 → 3, ) DIGIT THREE COMMA → DIGIT THREE, COMMA # + +248A ; 0033 002E ; MA #* ( ⒊ → 3. 
) DIGIT THREE FULL STOP → DIGIT THREE, FULL STOP # + +33FE ; 0033 006C 65E5 ; MA #* ( ㏾ → 3l日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THIRTY-ONE → DIGIT THREE, LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-65E5 # →31日→ + +33FD ; 0033 004F 65E5 ; MA #* ( ㏽ → 3O日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THIRTY → DIGIT THREE, LATIN CAPITAL LETTER O, CJK UNIFIED IDEOGRAPH-65E5 # →30日→ + +33E2 ; 0033 65E5 ; MA #* ( ㏢ → 3日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THREE → DIGIT THREE, CJK UNIFIED IDEOGRAPH-65E5 # + +32C2 ; 0033 6708 ; MA #* ( ㋂ → 3月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR MARCH → DIGIT THREE, CJK UNIFIED IDEOGRAPH-6708 # + +335B ; 0033 70B9 ; MA #* ( ㍛ → 3点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR THREE → DIGIT THREE, CJK UNIFIED IDEOGRAPH-70B9 # + +1CCF4 ; 0034 ; MA # ( 𜳴 → 4 ) OUTLINED DIGIT FOUR → DIGIT FOUR # +1D7D2 ; 0034 ; MA # ( 𝟒 → 4 ) MATHEMATICAL BOLD DIGIT FOUR → DIGIT FOUR # +1D7DC ; 0034 ; MA # ( 𝟜 → 4 ) MATHEMATICAL DOUBLE-STRUCK DIGIT FOUR → DIGIT FOUR # +1D7E6 ; 0034 ; MA # ( 𝟦 → 4 ) MATHEMATICAL SANS-SERIF DIGIT FOUR → DIGIT FOUR # +1D7F0 ; 0034 ; MA # ( 𝟰 → 4 ) MATHEMATICAL SANS-SERIF BOLD DIGIT FOUR → DIGIT FOUR # +1D7FA ; 0034 ; MA # ( 𝟺 → 4 ) MATHEMATICAL MONOSPACE DIGIT FOUR → DIGIT FOUR # +1FBF4 ; 0034 ; MA # ( 🯴 → 4 ) SEGMENTED DIGIT FOUR → DIGIT FOUR # +13CE ; 0034 ; MA # ( Ꮞ → 4 ) CHEROKEE LETTER SE → DIGIT FOUR # +118AF ; 0034 ; MA # ( 𑢯 → 4 ) WARANG CITI CAPITAL LETTER UC → DIGIT FOUR # + +06F4 ; 0664 ; MA # ( ۴ → ‎٤‎ ) EXTENDED ARABIC-INDIC DIGIT FOUR → ARABIC-INDIC DIGIT FOUR # + +0AEA ; 096A ; MA # ( ૪ → ४ ) GUJARATI DIGIT FOUR → DEVANAGARI DIGIT FOUR # + +2463 ; 2783 ; MA #* ( ④ → ➃ ) CIRCLED DIGIT FOUR → DINGBAT CIRCLED SANS-SERIF DIGIT FOUR # + +1F105 ; 0034 002C ; MA #* ( 🄅 → 4, ) DIGIT FOUR COMMA → DIGIT FOUR, COMMA # + +248B ; 0034 002E ; MA #* ( ⒋ → 4. 
) DIGIT FOUR FULL STOP → DIGIT FOUR, FULL STOP # + +1530 ; 0034 00B7 ; MA # ( ᔰ → 4· ) CANADIAN SYLLABICS WEST-CREE YWE → DIGIT FOUR, MIDDLE DOT # →4ᐧ→ + +33E3 ; 0034 65E5 ; MA #* ( ㏣ → 4日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FOUR → DIGIT FOUR, CJK UNIFIED IDEOGRAPH-65E5 # + +32C3 ; 0034 6708 ; MA #* ( ㋃ → 4月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR APRIL → DIGIT FOUR, CJK UNIFIED IDEOGRAPH-6708 # + +335C ; 0034 70B9 ; MA #* ( ㍜ → 4点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FOUR → DIGIT FOUR, CJK UNIFIED IDEOGRAPH-70B9 # + +1CCF5 ; 0035 ; MA # ( 𜳵 → 5 ) OUTLINED DIGIT FIVE → DIGIT FIVE # +1D7D3 ; 0035 ; MA # ( 𝟓 → 5 ) MATHEMATICAL BOLD DIGIT FIVE → DIGIT FIVE # +1D7DD ; 0035 ; MA # ( 𝟝 → 5 ) MATHEMATICAL DOUBLE-STRUCK DIGIT FIVE → DIGIT FIVE # +1D7E7 ; 0035 ; MA # ( 𝟧 → 5 ) MATHEMATICAL SANS-SERIF DIGIT FIVE → DIGIT FIVE # +1D7F1 ; 0035 ; MA # ( 𝟱 → 5 ) MATHEMATICAL SANS-SERIF BOLD DIGIT FIVE → DIGIT FIVE # +1D7FB ; 0035 ; MA # ( 𝟻 → 5 ) MATHEMATICAL MONOSPACE DIGIT FIVE → DIGIT FIVE # +1FBF5 ; 0035 ; MA # ( 🯵 → 5 ) SEGMENTED DIGIT FIVE → DIGIT FIVE # +01BC ; 0035 ; MA # ( Ƽ → 5 ) LATIN CAPITAL LETTER TONE FIVE → DIGIT FIVE # +118BB ; 0035 ; MA # ( 𑢻 → 5 ) WARANG CITI CAPITAL LETTER HORR → DIGIT FIVE # + +2464 ; 2784 ; MA #* ( ⑤ → ➄ ) CIRCLED DIGIT FIVE → DINGBAT CIRCLED SANS-SERIF DIGIT FIVE # + +1F106 ; 0035 002C ; MA #* ( 🄆 → 5, ) DIGIT FIVE COMMA → DIGIT FIVE, COMMA # + +248C ; 0035 002E ; MA #* ( ⒌ → 5. 
) DIGIT FIVE FULL STOP → DIGIT FIVE, FULL STOP # + +33E4 ; 0035 65E5 ; MA #* ( ㏤ → 5日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FIVE → DIGIT FIVE, CJK UNIFIED IDEOGRAPH-65E5 # + +32C4 ; 0035 6708 ; MA #* ( ㋄ → 5月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR MAY → DIGIT FIVE, CJK UNIFIED IDEOGRAPH-6708 # + +335D ; 0035 70B9 ; MA #* ( ㍝ → 5点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FIVE → DIGIT FIVE, CJK UNIFIED IDEOGRAPH-70B9 # + +1CCF6 ; 0036 ; MA # ( 𜳶 → 6 ) OUTLINED DIGIT SIX → DIGIT SIX # +1D7D4 ; 0036 ; MA # ( 𝟔 → 6 ) MATHEMATICAL BOLD DIGIT SIX → DIGIT SIX # +1D7DE ; 0036 ; MA # ( 𝟞 → 6 ) MATHEMATICAL DOUBLE-STRUCK DIGIT SIX → DIGIT SIX # +1D7E8 ; 0036 ; MA # ( 𝟨 → 6 ) MATHEMATICAL SANS-SERIF DIGIT SIX → DIGIT SIX # +1D7F2 ; 0036 ; MA # ( 𝟲 → 6 ) MATHEMATICAL SANS-SERIF BOLD DIGIT SIX → DIGIT SIX # +1D7FC ; 0036 ; MA # ( 𝟼 → 6 ) MATHEMATICAL MONOSPACE DIGIT SIX → DIGIT SIX # +1FBF6 ; 0036 ; MA # ( 🯶 → 6 ) SEGMENTED DIGIT SIX → DIGIT SIX # +2CD3 ; 0036 ; MA # ( ⳓ → 6 ) COPTIC SMALL LETTER OLD COPTIC HEI → DIGIT SIX # +2CD2 ; 0036 ; MA # ( Ⳓ → 6 ) COPTIC CAPITAL LETTER OLD COPTIC HEI → DIGIT SIX # +03EC ; 0036 ; MA # ( Ϭ → 6 ) COPTIC CAPITAL LETTER SHIMA → DIGIT SIX # +2CDC ; 0036 ; MA # ( Ⳝ → 6 ) COPTIC CAPITAL LETTER OLD NUBIAN SHIMA → DIGIT SIX # →Ϭ→ +0431 ; 0036 ; MA # ( б → 6 ) CYRILLIC SMALL LETTER BE → DIGIT SIX # +13EE ; 0036 ; MA # ( Ꮾ → 6 ) CHEROKEE LETTER WV → DIGIT SIX # +118D5 ; 0036 ; MA # ( 𑣕 → 6 ) WARANG CITI SMALL LETTER AT → DIGIT SIX # + +06F6 ; 0666 ; MA # ( ۶ → ‎٦‎ ) EXTENDED ARABIC-INDIC DIGIT SIX → ARABIC-INDIC DIGIT SIX # + +114D6 ; 09EC ; MA # ( 𑓖 → ৬ ) TIRHUTA DIGIT SIX → BENGALI DIGIT SIX # + +2465 ; 2785 ; MA #* ( ⑥ → ➅ ) CIRCLED DIGIT SIX → DINGBAT CIRCLED SANS-SERIF DIGIT SIX # + +1F107 ; 0036 002C ; MA #* ( 🄇 → 6, ) DIGIT SIX COMMA → DIGIT SIX, COMMA # + +248D ; 0036 002E ; MA #* ( ⒍ → 6. 
) DIGIT SIX FULL STOP → DIGIT SIX, FULL STOP # + +33E5 ; 0036 65E5 ; MA #* ( ㏥ → 6日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SIX → DIGIT SIX, CJK UNIFIED IDEOGRAPH-65E5 # + +32C5 ; 0036 6708 ; MA #* ( ㋅ → 6月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR JUNE → DIGIT SIX, CJK UNIFIED IDEOGRAPH-6708 # + +335E ; 0036 70B9 ; MA #* ( ㍞ → 6点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SIX → DIGIT SIX, CJK UNIFIED IDEOGRAPH-70B9 # + +1D212 ; 0037 ; MA #* ( 𝈒 → 7 ) GREEK VOCAL NOTATION SYMBOL-19 → DIGIT SEVEN # +1CCF7 ; 0037 ; MA # ( 𜳷 → 7 ) OUTLINED DIGIT SEVEN → DIGIT SEVEN # +1D7D5 ; 0037 ; MA # ( 𝟕 → 7 ) MATHEMATICAL BOLD DIGIT SEVEN → DIGIT SEVEN # +1D7DF ; 0037 ; MA # ( 𝟟 → 7 ) MATHEMATICAL DOUBLE-STRUCK DIGIT SEVEN → DIGIT SEVEN # +1D7E9 ; 0037 ; MA # ( 𝟩 → 7 ) MATHEMATICAL SANS-SERIF DIGIT SEVEN → DIGIT SEVEN # +1D7F3 ; 0037 ; MA # ( 𝟳 → 7 ) MATHEMATICAL SANS-SERIF BOLD DIGIT SEVEN → DIGIT SEVEN # +1D7FD ; 0037 ; MA # ( 𝟽 → 7 ) MATHEMATICAL MONOSPACE DIGIT SEVEN → DIGIT SEVEN # +1FBF7 ; 0037 ; MA # ( 🯷 → 7 ) SEGMENTED DIGIT SEVEN → DIGIT SEVEN # +104D2 ; 0037 ; MA # ( 𐓒 → 7 ) OSAGE CAPITAL LETTER ZA → DIGIT SEVEN # +118C6 ; 0037 ; MA # ( 𑣆 → 7 ) WARANG CITI SMALL LETTER II → DIGIT SEVEN # + +2466 ; 2786 ; MA #* ( ⑦ → ➆ ) CIRCLED DIGIT SEVEN → DINGBAT CIRCLED SANS-SERIF DIGIT SEVEN # + +1F108 ; 0037 002C ; MA #* ( 🄈 → 7, ) DIGIT SEVEN COMMA → DIGIT SEVEN, COMMA # + +248E ; 0037 002E ; MA #* ( ⒎ → 7. 
) DIGIT SEVEN FULL STOP → DIGIT SEVEN, FULL STOP # + +33E6 ; 0037 65E5 ; MA #* ( ㏦ → 7日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SEVEN → DIGIT SEVEN, CJK UNIFIED IDEOGRAPH-65E5 # + +32C6 ; 0037 6708 ; MA #* ( ㋆ → 7月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR JULY → DIGIT SEVEN, CJK UNIFIED IDEOGRAPH-6708 # + +335F ; 0037 70B9 ; MA #* ( ㍟ → 7点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SEVEN → DIGIT SEVEN, CJK UNIFIED IDEOGRAPH-70B9 # + +0B03 ; 0038 ; MA # ( ଃ → 8 ) ORIYA SIGN VISARGA → DIGIT EIGHT # +09EA ; 0038 ; MA # ( ৪ → 8 ) BENGALI DIGIT FOUR → DIGIT EIGHT # +0A6A ; 0038 ; MA # ( ੪ → 8 ) GURMUKHI DIGIT FOUR → DIGIT EIGHT # +1E8CB ; 0038 ; MA #* ( ‎𞣋‎ → 8 ) MENDE KIKAKUI DIGIT FIVE → DIGIT EIGHT # +1CCF8 ; 0038 ; MA # ( 𜳸 → 8 ) OUTLINED DIGIT EIGHT → DIGIT EIGHT # +1D7D6 ; 0038 ; MA # ( 𝟖 → 8 ) MATHEMATICAL BOLD DIGIT EIGHT → DIGIT EIGHT # +1D7E0 ; 0038 ; MA # ( 𝟠 → 8 ) MATHEMATICAL DOUBLE-STRUCK DIGIT EIGHT → DIGIT EIGHT # +1D7EA ; 0038 ; MA # ( 𝟪 → 8 ) MATHEMATICAL SANS-SERIF DIGIT EIGHT → DIGIT EIGHT # +1D7F4 ; 0038 ; MA # ( 𝟴 → 8 ) MATHEMATICAL SANS-SERIF BOLD DIGIT EIGHT → DIGIT EIGHT # +1D7FE ; 0038 ; MA # ( 𝟾 → 8 ) MATHEMATICAL MONOSPACE DIGIT EIGHT → DIGIT EIGHT # +1FBF8 ; 0038 ; MA # ( 🯸 → 8 ) SEGMENTED DIGIT EIGHT → DIGIT EIGHT # +0223 ; 0038 ; MA # ( ȣ → 8 ) LATIN SMALL LETTER OU → DIGIT EIGHT # +0222 ; 0038 ; MA # ( Ȣ → 8 ) LATIN CAPITAL LETTER OU → DIGIT EIGHT # +1031A ; 0038 ; MA # ( 𐌚 → 8 ) OLD ITALIC LETTER EF → DIGIT EIGHT # + +0AEE ; 096E ; MA # ( ૮ → ८ ) GUJARATI DIGIT EIGHT → DEVANAGARI DIGIT EIGHT # + +2467 ; 2787 ; MA #* ( ⑧ → ➇ ) CIRCLED DIGIT EIGHT → DINGBAT CIRCLED SANS-SERIF DIGIT EIGHT # + +1F109 ; 0038 002C ; MA #* ( 🄉 → 8, ) DIGIT EIGHT COMMA → DIGIT EIGHT, COMMA # + +248F ; 0038 002E ; MA #* ( ⒏ → 8. 
) DIGIT EIGHT FULL STOP → DIGIT EIGHT, FULL STOP # + +33E7 ; 0038 65E5 ; MA #* ( ㏧ → 8日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY EIGHT → DIGIT EIGHT, CJK UNIFIED IDEOGRAPH-65E5 # + +32C7 ; 0038 6708 ; MA #* ( ㋇ → 8月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR AUGUST → DIGIT EIGHT, CJK UNIFIED IDEOGRAPH-6708 # + +3360 ; 0038 70B9 ; MA #* ( ㍠ → 8点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR EIGHT → DIGIT EIGHT, CJK UNIFIED IDEOGRAPH-70B9 # + +0A67 ; 0039 ; MA # ( ੧ → 9 ) GURMUKHI DIGIT ONE → DIGIT NINE # +0B68 ; 0039 ; MA # ( ୨ → 9 ) ORIYA DIGIT TWO → DIGIT NINE # +09ED ; 0039 ; MA # ( ৭ → 9 ) BENGALI DIGIT SEVEN → DIGIT NINE # +0D6D ; 0039 ; MA # ( ൭ → 9 ) MALAYALAM DIGIT SEVEN → DIGIT NINE # +1CCF9 ; 0039 ; MA # ( 𜳹 → 9 ) OUTLINED DIGIT NINE → DIGIT NINE # +1D7D7 ; 0039 ; MA # ( 𝟗 → 9 ) MATHEMATICAL BOLD DIGIT NINE → DIGIT NINE # +1D7E1 ; 0039 ; MA # ( 𝟡 → 9 ) MATHEMATICAL DOUBLE-STRUCK DIGIT NINE → DIGIT NINE # +1D7EB ; 0039 ; MA # ( 𝟫 → 9 ) MATHEMATICAL SANS-SERIF DIGIT NINE → DIGIT NINE # +1D7F5 ; 0039 ; MA # ( 𝟵 → 9 ) MATHEMATICAL SANS-SERIF BOLD DIGIT NINE → DIGIT NINE # +1D7FF ; 0039 ; MA # ( 𝟿 → 9 ) MATHEMATICAL MONOSPACE DIGIT NINE → DIGIT NINE # +1FBF9 ; 0039 ; MA # ( 🯹 → 9 ) SEGMENTED DIGIT NINE → DIGIT NINE # +A76E ; 0039 ; MA # ( Ꝯ → 9 ) LATIN CAPITAL LETTER CON → DIGIT NINE # +2CCB ; 0039 ; MA # ( ⳋ → 9 ) COPTIC SMALL LETTER DIALECT-P HORI → DIGIT NINE # +2CCA ; 0039 ; MA # ( Ⳋ → 9 ) COPTIC CAPITAL LETTER DIALECT-P HORI → DIGIT NINE # +118CC ; 0039 ; MA # ( 𑣌 → 9 ) WARANG CITI SMALL LETTER KO → DIGIT NINE # +118AC ; 0039 ; MA # ( 𑢬 → 9 ) WARANG CITI CAPITAL LETTER KO → DIGIT NINE # +118D6 ; 0039 ; MA # ( 𑣖 → 9 ) WARANG CITI SMALL LETTER AM → DIGIT NINE # + +0967 ; 0669 ; MA # ( १ → ‎٩‎ ) DEVANAGARI DIGIT ONE → ARABIC-INDIC DIGIT NINE # +118E4 ; 0669 ; MA # ( 𑣤 → ‎٩‎ ) WARANG CITI DIGIT FOUR → ARABIC-INDIC DIGIT NINE # +06F9 ; 0669 ; MA # ( ۹ → ‎٩‎ ) EXTENDED ARABIC-INDIC DIGIT NINE → ARABIC-INDIC DIGIT NINE # + +0CEF ; 0C6F ; MA # ( ೯ → ౯ ) KANNADA DIGIT NINE → 
TELUGU DIGIT NINE # + +2468 ; 2788 ; MA #* ( ⑨ → ➈ ) CIRCLED DIGIT NINE → DINGBAT CIRCLED SANS-SERIF DIGIT NINE # + +1F10A ; 0039 002C ; MA #* ( 🄊 → 9, ) DIGIT NINE COMMA → DIGIT NINE, COMMA # + +2490 ; 0039 002E ; MA #* ( ⒐ → 9. ) DIGIT NINE FULL STOP → DIGIT NINE, FULL STOP # + +33E8 ; 0039 65E5 ; MA #* ( ㏨ → 9日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY NINE → DIGIT NINE, CJK UNIFIED IDEOGRAPH-65E5 # + +32C8 ; 0039 6708 ; MA #* ( ㋈ → 9月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR SEPTEMBER → DIGIT NINE, CJK UNIFIED IDEOGRAPH-6708 # + +3361 ; 0039 70B9 ; MA #* ( ㍡ → 9点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR NINE → DIGIT NINE, CJK UNIFIED IDEOGRAPH-70B9 # + +237A ; 0061 ; MA #* ( ⍺ → a ) APL FUNCTIONAL SYMBOL ALPHA → LATIN SMALL LETTER A # →α→ +FF41 ; 0061 ; MA # ( a → a ) FULLWIDTH LATIN SMALL LETTER A → LATIN SMALL LETTER A # →а→ +1D41A ; 0061 ; MA # ( 𝐚 → a ) MATHEMATICAL BOLD SMALL A → LATIN SMALL LETTER A # +1D44E ; 0061 ; MA # ( 𝑎 → a ) MATHEMATICAL ITALIC SMALL A → LATIN SMALL LETTER A # +1D482 ; 0061 ; MA # ( 𝒂 → a ) MATHEMATICAL BOLD ITALIC SMALL A → LATIN SMALL LETTER A # +1D4B6 ; 0061 ; MA # ( 𝒶 → a ) MATHEMATICAL SCRIPT SMALL A → LATIN SMALL LETTER A # +1D4EA ; 0061 ; MA # ( 𝓪 → a ) MATHEMATICAL BOLD SCRIPT SMALL A → LATIN SMALL LETTER A # +1D51E ; 0061 ; MA # ( 𝔞 → a ) MATHEMATICAL FRAKTUR SMALL A → LATIN SMALL LETTER A # +1D552 ; 0061 ; MA # ( 𝕒 → a ) MATHEMATICAL DOUBLE-STRUCK SMALL A → LATIN SMALL LETTER A # +1D586 ; 0061 ; MA # ( 𝖆 → a ) MATHEMATICAL BOLD FRAKTUR SMALL A → LATIN SMALL LETTER A # +1D5BA ; 0061 ; MA # ( 𝖺 → a ) MATHEMATICAL SANS-SERIF SMALL A → LATIN SMALL LETTER A # +1D5EE ; 0061 ; MA # ( 𝗮 → a ) MATHEMATICAL SANS-SERIF BOLD SMALL A → LATIN SMALL LETTER A # +1D622 ; 0061 ; MA # ( 𝘢 → a ) MATHEMATICAL SANS-SERIF ITALIC SMALL A → LATIN SMALL LETTER A # +1D656 ; 0061 ; MA # ( 𝙖 → a ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL A → LATIN SMALL LETTER A # +1D68A ; 0061 ; MA # ( 𝚊 → a ) MATHEMATICAL MONOSPACE SMALL A → LATIN SMALL LETTER A # +0251 ; 
0061 ; MA # ( ɑ → a ) LATIN SMALL LETTER ALPHA → LATIN SMALL LETTER A # +03B1 ; 0061 ; MA # ( α → a ) GREEK SMALL LETTER ALPHA → LATIN SMALL LETTER A # +1D6C2 ; 0061 ; MA # ( 𝛂 → a ) MATHEMATICAL BOLD SMALL ALPHA → LATIN SMALL LETTER A # →α→ +1D6FC ; 0061 ; MA # ( 𝛼 → a ) MATHEMATICAL ITALIC SMALL ALPHA → LATIN SMALL LETTER A # →α→ +1D736 ; 0061 ; MA # ( 𝜶 → a ) MATHEMATICAL BOLD ITALIC SMALL ALPHA → LATIN SMALL LETTER A # →α→ +1D770 ; 0061 ; MA # ( 𝝰 → a ) MATHEMATICAL SANS-SERIF BOLD SMALL ALPHA → LATIN SMALL LETTER A # →α→ +1D7AA ; 0061 ; MA # ( 𝞪 → a ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ALPHA → LATIN SMALL LETTER A # →α→ +0430 ; 0061 ; MA # ( а → a ) CYRILLIC SMALL LETTER A → LATIN SMALL LETTER A # + +2DF6 ; 0363 ; MA # ( ⷶ → ͣ ) COMBINING CYRILLIC LETTER A → COMBINING LATIN SMALL LETTER A # + +FF21 ; 0041 ; MA # ( A → A ) FULLWIDTH LATIN CAPITAL LETTER A → LATIN CAPITAL LETTER A # →А→ +1CCD6 ; 0041 ; MA #* ( 𜳖 → A ) OUTLINED LATIN CAPITAL LETTER A → LATIN CAPITAL LETTER A # +1D400 ; 0041 ; MA # ( 𝐀 → A ) MATHEMATICAL BOLD CAPITAL A → LATIN CAPITAL LETTER A # +1D434 ; 0041 ; MA # ( 𝐴 → A ) MATHEMATICAL ITALIC CAPITAL A → LATIN CAPITAL LETTER A # +1D468 ; 0041 ; MA # ( 𝑨 → A ) MATHEMATICAL BOLD ITALIC CAPITAL A → LATIN CAPITAL LETTER A # +1D49C ; 0041 ; MA # ( 𝒜 → A ) MATHEMATICAL SCRIPT CAPITAL A → LATIN CAPITAL LETTER A # +1D4D0 ; 0041 ; MA # ( 𝓐 → A ) MATHEMATICAL BOLD SCRIPT CAPITAL A → LATIN CAPITAL LETTER A # +1D504 ; 0041 ; MA # ( 𝔄 → A ) MATHEMATICAL FRAKTUR CAPITAL A → LATIN CAPITAL LETTER A # +1D538 ; 0041 ; MA # ( 𝔸 → A ) MATHEMATICAL DOUBLE-STRUCK CAPITAL A → LATIN CAPITAL LETTER A # +1D56C ; 0041 ; MA # ( 𝕬 → A ) MATHEMATICAL BOLD FRAKTUR CAPITAL A → LATIN CAPITAL LETTER A # +1D5A0 ; 0041 ; MA # ( 𝖠 → A ) MATHEMATICAL SANS-SERIF CAPITAL A → LATIN CAPITAL LETTER A # +1D5D4 ; 0041 ; MA # ( 𝗔 → A ) MATHEMATICAL SANS-SERIF BOLD CAPITAL A → LATIN CAPITAL LETTER A # +1D608 ; 0041 ; MA # ( 𝘈 → A ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL A → LATIN 
CAPITAL LETTER A # +1D63C ; 0041 ; MA # ( 𝘼 → A ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL A → LATIN CAPITAL LETTER A # +1D670 ; 0041 ; MA # ( 𝙰 → A ) MATHEMATICAL MONOSPACE CAPITAL A → LATIN CAPITAL LETTER A # +0391 ; 0041 ; MA # ( Α → A ) GREEK CAPITAL LETTER ALPHA → LATIN CAPITAL LETTER A # +1D6A8 ; 0041 ; MA # ( 𝚨 → A ) MATHEMATICAL BOLD CAPITAL ALPHA → LATIN CAPITAL LETTER A # →𝐀→ +1D6E2 ; 0041 ; MA # ( 𝛢 → A ) MATHEMATICAL ITALIC CAPITAL ALPHA → LATIN CAPITAL LETTER A # →Α→ +1D71C ; 0041 ; MA # ( 𝜜 → A ) MATHEMATICAL BOLD ITALIC CAPITAL ALPHA → LATIN CAPITAL LETTER A # →Α→ +1D756 ; 0041 ; MA # ( 𝝖 → A ) MATHEMATICAL SANS-SERIF BOLD CAPITAL ALPHA → LATIN CAPITAL LETTER A # →Α→ +1D790 ; 0041 ; MA # ( 𝞐 → A ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL ALPHA → LATIN CAPITAL LETTER A # →Α→ +0410 ; 0041 ; MA # ( А → A ) CYRILLIC CAPITAL LETTER A → LATIN CAPITAL LETTER A # +13AA ; 0041 ; MA # ( Ꭺ → A ) CHEROKEE LETTER GO → LATIN CAPITAL LETTER A # +15C5 ; 0041 ; MA # ( ᗅ → A ) CANADIAN SYLLABICS CARRIER GHO → LATIN CAPITAL LETTER A # +A4EE ; 0041 ; MA # ( ꓮ → A ) LISU LETTER A → LATIN CAPITAL LETTER A # +16F40 ; 0041 ; MA # ( 𖽀 → A ) MIAO LETTER ZZYA → LATIN CAPITAL LETTER A # +102A0 ; 0041 ; MA # ( 𐊠 → A ) CARIAN LETTER A → LATIN CAPITAL LETTER A # + +2376 ; 0061 0332 ; MA #* ( ⍶ → a̲ ) APL FUNCTIONAL SYMBOL ALPHA UNDERBAR → LATIN SMALL LETTER A, COMBINING LOW LINE # →α̲→→ɑ̲→ + +01CE ; 0103 ; MA # ( ǎ → ă ) LATIN SMALL LETTER A WITH CARON → LATIN SMALL LETTER A WITH BREVE # + +01CD ; 0102 ; MA # ( Ǎ → Ă ) LATIN CAPITAL LETTER A WITH CARON → LATIN CAPITAL LETTER A WITH BREVE # + +0227 ; 00E5 ; MA # ( ȧ → å ) LATIN SMALL LETTER A WITH DOT ABOVE → LATIN SMALL LETTER A WITH RING ABOVE # + +0226 ; 00C5 ; MA # ( Ȧ → Å ) LATIN CAPITAL LETTER A WITH DOT ABOVE → LATIN CAPITAL LETTER A WITH RING ABOVE # + +1E9A ; 1EA3 ; MA # ( ẚ → ả ) LATIN SMALL LETTER A WITH RIGHT HALF RING → LATIN SMALL LETTER A WITH HOOK ABOVE # + +2100 ; 0061 002F 0063 ; MA #* ( ℀ → a/c ) ACCOUNT OF 
→ LATIN SMALL LETTER A, SOLIDUS, LATIN SMALL LETTER C # + +2101 ; 0061 002F 0073 ; MA #* ( ℁ → a/s ) ADDRESSED TO THE SUBJECT → LATIN SMALL LETTER A, SOLIDUS, LATIN SMALL LETTER S # + +A733 ; 0061 0061 ; MA # ( ꜳ → aa ) LATIN SMALL LETTER AA → LATIN SMALL LETTER A, LATIN SMALL LETTER A # + +A732 ; 0041 0041 ; MA # ( Ꜳ → AA ) LATIN CAPITAL LETTER AA → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER A # + +00E6 ; 0061 0065 ; MA # ( æ → ae ) LATIN SMALL LETTER AE → LATIN SMALL LETTER A, LATIN SMALL LETTER E # +04D5 ; 0061 0065 ; MA # ( ӕ → ae ) CYRILLIC SMALL LIGATURE A IE → LATIN SMALL LETTER A, LATIN SMALL LETTER E # →ае→ + +00C6 ; 0041 0045 ; MA # ( Æ → AE ) LATIN CAPITAL LETTER AE → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER E # +04D4 ; 0041 0045 ; MA # ( Ӕ → AE ) CYRILLIC CAPITAL LIGATURE A IE → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER E # →Æ→ + +A735 ; 0061 006F ; MA # ( ꜵ → ao ) LATIN SMALL LETTER AO → LATIN SMALL LETTER A, LATIN SMALL LETTER O # + +A734 ; 0041 004F ; MA # ( Ꜵ → AO ) LATIN CAPITAL LETTER AO → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER O # + +1F707 ; 0041 0052 ; MA #* ( 🜇 → AR ) ALCHEMICAL SYMBOL FOR AQUA REGIA-2 → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER R # + +A737 ; 0061 0075 ; MA # ( ꜷ → au ) LATIN SMALL LETTER AU → LATIN SMALL LETTER A, LATIN SMALL LETTER U # + +A736 ; 0041 0055 ; MA # ( Ꜷ → AU ) LATIN CAPITAL LETTER AU → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER U # + +A739 ; 0061 0076 ; MA # ( ꜹ → av ) LATIN SMALL LETTER AV → LATIN SMALL LETTER A, LATIN SMALL LETTER V # +A73B ; 0061 0076 ; MA # ( ꜻ → av ) LATIN SMALL LETTER AV WITH HORIZONTAL BAR → LATIN SMALL LETTER A, LATIN SMALL LETTER V # + +A738 ; 0041 0056 ; MA # ( Ꜹ → AV ) LATIN CAPITAL LETTER AV → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER V # +A73A ; 0041 0056 ; MA # ( Ꜻ → AV ) LATIN CAPITAL LETTER AV WITH HORIZONTAL BAR → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER V # + +A73D ; 0061 0079 ; MA # ( ꜽ → ay ) LATIN SMALL LETTER AY → LATIN SMALL LETTER A, LATIN SMALL 
LETTER Y # + +A73C ; 0041 0059 ; MA # ( Ꜽ → AY ) LATIN CAPITAL LETTER AY → LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER Y # + +AB7A ; 1D00 ; MA # ( ꭺ → ᴀ ) CHEROKEE SMALL LETTER GO → LATIN LETTER SMALL CAPITAL A # + +2200 ; 2C6F ; MA #* ( ∀ → Ɐ ) FOR ALL → LATIN CAPITAL LETTER TURNED A # +1D217 ; 2C6F ; MA #* ( 𝈗 → Ɐ ) GREEK VOCAL NOTATION SYMBOL-24 → LATIN CAPITAL LETTER TURNED A # +15C4 ; 2C6F ; MA # ( ᗄ → Ɐ ) CANADIAN SYLLABICS CARRIER GHU → LATIN CAPITAL LETTER TURNED A # →∀→ +A4EF ; 2C6F ; MA # ( ꓯ → Ɐ ) LISU LETTER AE → LATIN CAPITAL LETTER TURNED A # + +1041F ; 2C70 ; MA # ( 𐐟 → Ɒ ) DESERET CAPITAL LETTER ESH → LATIN CAPITAL LETTER TURNED ALPHA # + +1D41B ; 0062 ; MA # ( 𝐛 → b ) MATHEMATICAL BOLD SMALL B → LATIN SMALL LETTER B # +1D44F ; 0062 ; MA # ( 𝑏 → b ) MATHEMATICAL ITALIC SMALL B → LATIN SMALL LETTER B # +1D483 ; 0062 ; MA # ( 𝒃 → b ) MATHEMATICAL BOLD ITALIC SMALL B → LATIN SMALL LETTER B # +1D4B7 ; 0062 ; MA # ( 𝒷 → b ) MATHEMATICAL SCRIPT SMALL B → LATIN SMALL LETTER B # +1D4EB ; 0062 ; MA # ( 𝓫 → b ) MATHEMATICAL BOLD SCRIPT SMALL B → LATIN SMALL LETTER B # +1D51F ; 0062 ; MA # ( 𝔟 → b ) MATHEMATICAL FRAKTUR SMALL B → LATIN SMALL LETTER B # +1D553 ; 0062 ; MA # ( 𝕓 → b ) MATHEMATICAL DOUBLE-STRUCK SMALL B → LATIN SMALL LETTER B # +1D587 ; 0062 ; MA # ( 𝖇 → b ) MATHEMATICAL BOLD FRAKTUR SMALL B → LATIN SMALL LETTER B # +1D5BB ; 0062 ; MA # ( 𝖻 → b ) MATHEMATICAL SANS-SERIF SMALL B → LATIN SMALL LETTER B # +1D5EF ; 0062 ; MA # ( 𝗯 → b ) MATHEMATICAL SANS-SERIF BOLD SMALL B → LATIN SMALL LETTER B # +1D623 ; 0062 ; MA # ( 𝘣 → b ) MATHEMATICAL SANS-SERIF ITALIC SMALL B → LATIN SMALL LETTER B # +1D657 ; 0062 ; MA # ( 𝙗 → b ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL B → LATIN SMALL LETTER B # +1D68B ; 0062 ; MA # ( 𝚋 → b ) MATHEMATICAL MONOSPACE SMALL B → LATIN SMALL LETTER B # +0184 ; 0062 ; MA # ( Ƅ → b ) LATIN CAPITAL LETTER TONE SIX → LATIN SMALL LETTER B # +042C ; 0062 ; MA # ( Ь → b ) CYRILLIC CAPITAL LETTER SOFT SIGN → LATIN SMALL LETTER B # →Ƅ→ 
+13CF ; 0062 ; MA # ( Ꮟ → b ) CHEROKEE LETTER SI → LATIN SMALL LETTER B # +1472 ; 0062 ; MA # ( ᑲ → b ) CANADIAN SYLLABICS KA → LATIN SMALL LETTER B # +15AF ; 0062 ; MA # ( ᖯ → b ) CANADIAN SYLLABICS AIVILIK B → LATIN SMALL LETTER B # +16EB6 ; 0062 ; MA # ( 𖺶 → b ) BERIA ERFE CAPITAL LETTER UI → LATIN SMALL LETTER B # →Ь→→Ƅ→ + +FF22 ; 0042 ; MA # ( B → B ) FULLWIDTH LATIN CAPITAL LETTER B → LATIN CAPITAL LETTER B # →Β→ +212C ; 0042 ; MA # ( ℬ → B ) SCRIPT CAPITAL B → LATIN CAPITAL LETTER B # +1CCD7 ; 0042 ; MA #* ( 𜳗 → B ) OUTLINED LATIN CAPITAL LETTER B → LATIN CAPITAL LETTER B # +1D401 ; 0042 ; MA # ( 𝐁 → B ) MATHEMATICAL BOLD CAPITAL B → LATIN CAPITAL LETTER B # +1D435 ; 0042 ; MA # ( 𝐵 → B ) MATHEMATICAL ITALIC CAPITAL B → LATIN CAPITAL LETTER B # +1D469 ; 0042 ; MA # ( 𝑩 → B ) MATHEMATICAL BOLD ITALIC CAPITAL B → LATIN CAPITAL LETTER B # +1D4D1 ; 0042 ; MA # ( 𝓑 → B ) MATHEMATICAL BOLD SCRIPT CAPITAL B → LATIN CAPITAL LETTER B # +1D505 ; 0042 ; MA # ( 𝔅 → B ) MATHEMATICAL FRAKTUR CAPITAL B → LATIN CAPITAL LETTER B # +1D539 ; 0042 ; MA # ( 𝔹 → B ) MATHEMATICAL DOUBLE-STRUCK CAPITAL B → LATIN CAPITAL LETTER B # +1D56D ; 0042 ; MA # ( 𝕭 → B ) MATHEMATICAL BOLD FRAKTUR CAPITAL B → LATIN CAPITAL LETTER B # +1D5A1 ; 0042 ; MA # ( 𝖡 → B ) MATHEMATICAL SANS-SERIF CAPITAL B → LATIN CAPITAL LETTER B # +1D5D5 ; 0042 ; MA # ( 𝗕 → B ) MATHEMATICAL SANS-SERIF BOLD CAPITAL B → LATIN CAPITAL LETTER B # +1D609 ; 0042 ; MA # ( 𝘉 → B ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL B → LATIN CAPITAL LETTER B # +1D63D ; 0042 ; MA # ( 𝘽 → B ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL B → LATIN CAPITAL LETTER B # +1D671 ; 0042 ; MA # ( 𝙱 → B ) MATHEMATICAL MONOSPACE CAPITAL B → LATIN CAPITAL LETTER B # +A7B4 ; 0042 ; MA # ( Ꞵ → B ) LATIN CAPITAL LETTER BETA → LATIN CAPITAL LETTER B # +0392 ; 0042 ; MA # ( Β → B ) GREEK CAPITAL LETTER BETA → LATIN CAPITAL LETTER B # +1D6A9 ; 0042 ; MA # ( 𝚩 → B ) MATHEMATICAL BOLD CAPITAL BETA → LATIN CAPITAL LETTER B # →Β→ +1D6E3 ; 0042 ; MA # ( 𝛣 → B ) 
MATHEMATICAL ITALIC CAPITAL BETA → LATIN CAPITAL LETTER B # →Β→ +1D71D ; 0042 ; MA # ( 𝜝 → B ) MATHEMATICAL BOLD ITALIC CAPITAL BETA → LATIN CAPITAL LETTER B # →Β→ +1D757 ; 0042 ; MA # ( 𝝗 → B ) MATHEMATICAL SANS-SERIF BOLD CAPITAL BETA → LATIN CAPITAL LETTER B # →Β→ +1D791 ; 0042 ; MA # ( 𝞑 → B ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL BETA → LATIN CAPITAL LETTER B # →Β→ +2C82 ; 0042 ; MA # ( Ⲃ → B ) COPTIC CAPITAL LETTER VIDA → LATIN CAPITAL LETTER B # +0412 ; 0042 ; MA # ( В → B ) CYRILLIC CAPITAL LETTER VE → LATIN CAPITAL LETTER B # +13F4 ; 0042 ; MA # ( Ᏼ → B ) CHEROKEE LETTER YV → LATIN CAPITAL LETTER B # +15F7 ; 0042 ; MA # ( ᗷ → B ) CANADIAN SYLLABICS CARRIER KHE → LATIN CAPITAL LETTER B # +A4D0 ; 0042 ; MA # ( ꓐ → B ) LISU LETTER BA → LATIN CAPITAL LETTER B # +10282 ; 0042 ; MA # ( 𐊂 → B ) LYCIAN LETTER B → LATIN CAPITAL LETTER B # +102A1 ; 0042 ; MA # ( 𐊡 → B ) CARIAN LETTER P2 → LATIN CAPITAL LETTER B # +10301 ; 0042 ; MA # ( 𐌁 → B ) OLD ITALIC LETTER BE → LATIN CAPITAL LETTER B # + +0253 ; 0062 0314 ; MA # ( ɓ → b̔ ) LATIN SMALL LETTER B WITH HOOK → LATIN SMALL LETTER B, COMBINING REVERSED COMMA ABOVE # + +1473 ; 0062 0307 ; MA # ( ᑳ → ḃ ) CANADIAN SYLLABICS KAA → LATIN SMALL LETTER B, COMBINING DOT ABOVE # + +0183 ; 0062 0304 ; MA # ( ƃ → b̄ ) LATIN SMALL LETTER B WITH TOPBAR → LATIN SMALL LETTER B, COMBINING MACRON # +0182 ; 0062 0304 ; MA # ( Ƃ → b̄ ) LATIN CAPITAL LETTER B WITH TOPBAR → LATIN SMALL LETTER B, COMBINING MACRON # +0411 ; 0062 0304 ; MA # ( Б → b̄ ) CYRILLIC CAPITAL LETTER BE → LATIN SMALL LETTER B, COMBINING MACRON # →Ƃ→ + +0180 ; 0062 0335 ; MA # ( ƀ → b̵ ) LATIN SMALL LETTER B WITH STROKE → LATIN SMALL LETTER B, COMBINING SHORT STROKE OVERLAY # +048D ; 0062 0335 ; MA # ( ҍ → b̵ ) CYRILLIC SMALL LETTER SEMISOFT SIGN → LATIN SMALL LETTER B, COMBINING SHORT STROKE OVERLAY # →ѣ→→Ь̵→ +048C ; 0062 0335 ; MA # ( Ҍ → b̵ ) CYRILLIC CAPITAL LETTER SEMISOFT SIGN → LATIN SMALL LETTER B, COMBINING SHORT STROKE OVERLAY # →Ѣ→→Ь̵→ +0463 ; 0062 
0335 ; MA # ( ѣ → b̵ ) CYRILLIC SMALL LETTER YAT → LATIN SMALL LETTER B, COMBINING SHORT STROKE OVERLAY # →Ь̵→ +0462 ; 0062 0335 ; MA # ( Ѣ → b̵ ) CYRILLIC CAPITAL LETTER YAT → LATIN SMALL LETTER B, COMBINING SHORT STROKE OVERLAY # →Ь̵→ + +147F ; 0062 00B7 ; MA # ( ᑿ → b· ) CANADIAN SYLLABICS WEST-CREE KWA → LATIN SMALL LETTER B, MIDDLE DOT # →ᑲᐧ→ + +1481 ; 0062 0307 00B7 ; MA # ( ᒁ → ḃ· ) CANADIAN SYLLABICS WEST-CREE KWAA → LATIN SMALL LETTER B, COMBINING DOT ABOVE, MIDDLE DOT # →ᑳᐧ→ + +1488 ; 0062 0027 ; MA # ( ᒈ → b' ) CANADIAN SYLLABICS SOUTH-SLAVEY KAH → LATIN SMALL LETTER B, APOSTROPHE # →ᑲᑊ→ + +042B ; 0062 006C ; MA # ( Ы → bl ) CYRILLIC CAPITAL LETTER YERU → LATIN SMALL LETTER B, LATIN SMALL LETTER L # →ЬІ→→Ь1→ + +2C83 ; 0299 ; MA # ( ⲃ → ʙ ) COPTIC SMALL LETTER VIDA → LATIN LETTER SMALL CAPITAL B # →в→ +0432 ; 0299 ; MA # ( в → ʙ ) CYRILLIC SMALL LETTER VE → LATIN LETTER SMALL CAPITAL B # +13FC ; 0299 ; MA # ( ᏼ → ʙ ) CHEROKEE SMALL LETTER YV → LATIN LETTER SMALL CAPITAL B # + +FF43 ; 0063 ; MA # ( c → c ) FULLWIDTH LATIN SMALL LETTER C → LATIN SMALL LETTER C # →с→ +217D ; 0063 ; MA # ( ⅽ → c ) SMALL ROMAN NUMERAL ONE HUNDRED → LATIN SMALL LETTER C # +1D41C ; 0063 ; MA # ( 𝐜 → c ) MATHEMATICAL BOLD SMALL C → LATIN SMALL LETTER C # +1D450 ; 0063 ; MA # ( 𝑐 → c ) MATHEMATICAL ITALIC SMALL C → LATIN SMALL LETTER C # +1D484 ; 0063 ; MA # ( 𝒄 → c ) MATHEMATICAL BOLD ITALIC SMALL C → LATIN SMALL LETTER C # +1D4B8 ; 0063 ; MA # ( 𝒸 → c ) MATHEMATICAL SCRIPT SMALL C → LATIN SMALL LETTER C # +1D4EC ; 0063 ; MA # ( 𝓬 → c ) MATHEMATICAL BOLD SCRIPT SMALL C → LATIN SMALL LETTER C # +1D520 ; 0063 ; MA # ( 𝔠 → c ) MATHEMATICAL FRAKTUR SMALL C → LATIN SMALL LETTER C # +1D554 ; 0063 ; MA # ( 𝕔 → c ) MATHEMATICAL DOUBLE-STRUCK SMALL C → LATIN SMALL LETTER C # +1D588 ; 0063 ; MA # ( 𝖈 → c ) MATHEMATICAL BOLD FRAKTUR SMALL C → LATIN SMALL LETTER C # +1D5BC ; 0063 ; MA # ( 𝖼 → c ) MATHEMATICAL SANS-SERIF SMALL C → LATIN SMALL LETTER C # +1D5F0 ; 0063 ; MA # ( 𝗰 → c ) 
MATHEMATICAL SANS-SERIF BOLD SMALL C → LATIN SMALL LETTER C # +1D624 ; 0063 ; MA # ( 𝘤 → c ) MATHEMATICAL SANS-SERIF ITALIC SMALL C → LATIN SMALL LETTER C # +1D658 ; 0063 ; MA # ( 𝙘 → c ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL C → LATIN SMALL LETTER C # +1D68C ; 0063 ; MA # ( 𝚌 → c ) MATHEMATICAL MONOSPACE SMALL C → LATIN SMALL LETTER C # +1D04 ; 0063 ; MA # ( ᴄ → c ) LATIN LETTER SMALL CAPITAL C → LATIN SMALL LETTER C # +03F2 ; 0063 ; MA # ( ϲ → c ) GREEK LUNATE SIGMA SYMBOL → LATIN SMALL LETTER C # +2CA5 ; 0063 ; MA # ( ⲥ → c ) COPTIC SMALL LETTER SIMA → LATIN SMALL LETTER C # →ϲ→ +0441 ; 0063 ; MA # ( с → c ) CYRILLIC SMALL LETTER ES → LATIN SMALL LETTER C # +1004 ; 0063 ; MA # ( င → c ) MYANMAR LETTER NGA → LATIN SMALL LETTER C # +105A ; 0063 ; MA # ( ၚ → c ) MYANMAR LETTER MON NGA → LATIN SMALL LETTER C # →င→ +ABAF ; 0063 ; MA # ( ꮯ → c ) CHEROKEE SMALL LETTER TLI → LATIN SMALL LETTER C # →ᴄ→ +1043D ; 0063 ; MA # ( 𐐽 → c ) DESERET SMALL LETTER CHEE → LATIN SMALL LETTER C # + +2DED ; 0368 ; MA # ( ⷭ → ͨ ) COMBINING CYRILLIC LETTER ES → COMBINING LATIN SMALL LETTER C # + +1F74C ; 0043 ; MA #* ( 🝌 → C ) ALCHEMICAL SYMBOL FOR CALX → LATIN CAPITAL LETTER C # +118E9 ; 0043 ; MA # ( 𑣩 → C ) WARANG CITI DIGIT NINE → LATIN CAPITAL LETTER C # +118F2 ; 0043 ; MA #* ( 𑣲 → C ) WARANG CITI NUMBER NINETY → LATIN CAPITAL LETTER C # +FF23 ; 0043 ; MA # ( C → C ) FULLWIDTH LATIN CAPITAL LETTER C → LATIN CAPITAL LETTER C # →С→ +216D ; 0043 ; MA # ( Ⅽ → C ) ROMAN NUMERAL ONE HUNDRED → LATIN CAPITAL LETTER C # +2102 ; 0043 ; MA # ( ℂ → C ) DOUBLE-STRUCK CAPITAL C → LATIN CAPITAL LETTER C # +212D ; 0043 ; MA # ( ℭ → C ) BLACK-LETTER CAPITAL C → LATIN CAPITAL LETTER C # +1CCD8 ; 0043 ; MA #* ( 𜳘 → C ) OUTLINED LATIN CAPITAL LETTER C → LATIN CAPITAL LETTER C # +1D402 ; 0043 ; MA # ( 𝐂 → C ) MATHEMATICAL BOLD CAPITAL C → LATIN CAPITAL LETTER C # +1D436 ; 0043 ; MA # ( 𝐶 → C ) MATHEMATICAL ITALIC CAPITAL C → LATIN CAPITAL LETTER C # +1D46A ; 0043 ; MA # ( 𝑪 → C ) MATHEMATICAL BOLD 
ITALIC CAPITAL C → LATIN CAPITAL LETTER C # +1D49E ; 0043 ; MA # ( 𝒞 → C ) MATHEMATICAL SCRIPT CAPITAL C → LATIN CAPITAL LETTER C # +1D4D2 ; 0043 ; MA # ( 𝓒 → C ) MATHEMATICAL BOLD SCRIPT CAPITAL C → LATIN CAPITAL LETTER C # +1D56E ; 0043 ; MA # ( 𝕮 → C ) MATHEMATICAL BOLD FRAKTUR CAPITAL C → LATIN CAPITAL LETTER C # +1D5A2 ; 0043 ; MA # ( 𝖢 → C ) MATHEMATICAL SANS-SERIF CAPITAL C → LATIN CAPITAL LETTER C # +1D5D6 ; 0043 ; MA # ( 𝗖 → C ) MATHEMATICAL SANS-SERIF BOLD CAPITAL C → LATIN CAPITAL LETTER C # +1D60A ; 0043 ; MA # ( 𝘊 → C ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL C → LATIN CAPITAL LETTER C # +1D63E ; 0043 ; MA # ( 𝘾 → C ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL C → LATIN CAPITAL LETTER C # +1D672 ; 0043 ; MA # ( 𝙲 → C ) MATHEMATICAL MONOSPACE CAPITAL C → LATIN CAPITAL LETTER C # +03F9 ; 0043 ; MA # ( Ϲ → C ) GREEK CAPITAL LUNATE SIGMA SYMBOL → LATIN CAPITAL LETTER C # +2CA4 ; 0043 ; MA # ( Ⲥ → C ) COPTIC CAPITAL LETTER SIMA → LATIN CAPITAL LETTER C # →Ϲ→ +0421 ; 0043 ; MA # ( С → C ) CYRILLIC CAPITAL LETTER ES → LATIN CAPITAL LETTER C # +13DF ; 0043 ; MA # ( Ꮯ → C ) CHEROKEE LETTER TLI → LATIN CAPITAL LETTER C # +A4DA ; 0043 ; MA # ( ꓚ → C ) LISU LETTER CA → LATIN CAPITAL LETTER C # +102A2 ; 0043 ; MA # ( 𐊢 → C ) CARIAN LETTER D → LATIN CAPITAL LETTER C # +10302 ; 0043 ; MA # ( 𐌂 → C ) OLD ITALIC LETTER KE → LATIN CAPITAL LETTER C # +10415 ; 0043 ; MA # ( 𐐕 → C ) DESERET CAPITAL LETTER CHEE → LATIN CAPITAL LETTER C # +1051C ; 0043 ; MA # ( 𐔜 → C ) ELBASAN LETTER SHE → LATIN CAPITAL LETTER C # + +00A2 ; 0063 0338 ; MA #* ( ¢ → c̸ ) CENT SIGN → LATIN SMALL LETTER C, COMBINING LONG SOLIDUS OVERLAY # +023C ; 0063 0338 ; MA # ( ȼ → c̸ ) LATIN SMALL LETTER C WITH STROKE → LATIN SMALL LETTER C, COMBINING LONG SOLIDUS OVERLAY # →¢→ + +20A1 ; 0043 20EB ; MA #* ( ₡ → C⃫ ) COLON SIGN → LATIN CAPITAL LETTER C, COMBINING LONG DOUBLE SOLIDUS OVERLAY # + +1F16E ; 0043 20E0 ; MA #* ( 🅮 → C⃠ ) CIRCLED C WITH OVERLAID BACKSLASH → LATIN CAPITAL LETTER C, COMBINING 
ENCLOSING CIRCLE BACKSLASH # + +00E7 ; 0063 0326 ; MA # ( ç → c̦ ) LATIN SMALL LETTER C WITH CEDILLA → LATIN SMALL LETTER C, COMBINING COMMA BELOW # →ҫ→→с̡→ +04AB ; 0063 0326 ; MA # ( ҫ → c̦ ) CYRILLIC SMALL LETTER ES WITH DESCENDER → LATIN SMALL LETTER C, COMBINING COMMA BELOW # →с̡→ + +00C7 ; 0043 0326 ; MA # ( Ç → C̦ ) LATIN CAPITAL LETTER C WITH CEDILLA → LATIN CAPITAL LETTER C, COMBINING COMMA BELOW # →Ҫ→→С̡→ +04AA ; 0043 0326 ; MA # ( Ҫ → C̦ ) CYRILLIC CAPITAL LETTER ES WITH DESCENDER → LATIN CAPITAL LETTER C, COMBINING COMMA BELOW # →С̡→ + +0187 ; 0043 0027 ; MA # ( Ƈ → C' ) LATIN CAPITAL LETTER C WITH HOOK → LATIN CAPITAL LETTER C, APOSTROPHE # →Cʽ→ + +2105 ; 0063 002F 006F ; MA #* ( ℅ → c/o ) CARE OF → LATIN SMALL LETTER C, SOLIDUS, LATIN SMALL LETTER O # + +2106 ; 0063 002F 0075 ; MA #* ( ℆ → c/u ) CADA UNA → LATIN SMALL LETTER C, SOLIDUS, LATIN SMALL LETTER U # + +1F16D ; 33C4 0009 20DD ; MA #* ( 🅭 → ㏄ ⃝ ) CIRCLED CC → SQUARE CC, <CHARACTER TABULATION>, COMBINING ENCLOSING CIRCLE # + +22F4 ; A793 ; MA #* ( ⋴ → ꞓ ) SMALL ELEMENT OF WITH VERTICAL BAR AT END OF HORIZONTAL STROKE → LATIN SMALL LETTER C WITH BAR # →ɛ→→є→ +025B ; A793 ; MA # ( ɛ → ꞓ ) LATIN SMALL LETTER OPEN E → LATIN SMALL LETTER C WITH BAR # →є→ +03B5 ; A793 ; MA # ( ε → ꞓ ) GREEK SMALL LETTER EPSILON → LATIN SMALL LETTER C WITH BAR # →є→ +03F5 ; A793 ; MA # ( ϵ → ꞓ ) GREEK LUNATE EPSILON SYMBOL → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D6C6 ; A793 ; MA # ( 𝛆 → ꞓ ) MATHEMATICAL BOLD SMALL EPSILON → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D6DC ; A793 ; MA # ( 𝛜 → ꞓ ) MATHEMATICAL BOLD EPSILON SYMBOL → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D700 ; A793 ; MA # ( 𝜀 → ꞓ ) MATHEMATICAL ITALIC SMALL EPSILON → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D716 ; A793 ; MA # ( 𝜖 → ꞓ ) MATHEMATICAL ITALIC EPSILON SYMBOL → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D73A ; A793 ; MA # ( 𝜺 → ꞓ ) MATHEMATICAL BOLD ITALIC SMALL EPSILON → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D750 ; A793 ; MA # ( 𝝐 → ꞓ ) MATHEMATICAL BOLD ITALIC
EPSILON SYMBOL → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D774 ; A793 ; MA # ( 𝝴 → ꞓ ) MATHEMATICAL SANS-SERIF BOLD SMALL EPSILON → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D78A ; A793 ; MA # ( 𝞊 → ꞓ ) MATHEMATICAL SANS-SERIF BOLD EPSILON SYMBOL → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D7AE ; A793 ; MA # ( 𝞮 → ꞓ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL EPSILON → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +1D7C4 ; A793 ; MA # ( 𝟄 → ꞓ ) MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +2C89 ; A793 ; MA # ( ⲉ → ꞓ ) COPTIC SMALL LETTER EIE → LATIN SMALL LETTER C WITH BAR # →є→ +0454 ; A793 ; MA # ( є → ꞓ ) CYRILLIC SMALL LETTER UKRAINIAN IE → LATIN SMALL LETTER C WITH BAR # +0511 ; A793 ; MA # ( ԑ → ꞓ ) CYRILLIC SMALL LETTER REVERSED ZE → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +AB9B ; A793 ; MA # ( ꮛ → ꞓ ) CHEROKEE SMALL LETTER QUV → LATIN SMALL LETTER C WITH BAR # →ɛ→→є→ +118CE ; A793 ; MA # ( 𑣎 → ꞓ ) WARANG CITI SMALL LETTER YUJ → LATIN SMALL LETTER C WITH BAR # →ε→→є→ +10429 ; A793 ; MA # ( 𐐩 → ꞓ ) DESERET SMALL LETTER LONG E → LATIN SMALL LETTER C WITH BAR # →ɛ→→є→ + +20AC ; A792 ; MA #* ( € → Ꞓ ) EURO SIGN → LATIN CAPITAL LETTER C WITH BAR # →Є→ +2C88 ; A792 ; MA # ( Ⲉ → Ꞓ ) COPTIC CAPITAL LETTER EIE → LATIN CAPITAL LETTER C WITH BAR # →Є→ +0404 ; A792 ; MA # ( Є → Ꞓ ) CYRILLIC CAPITAL LETTER UKRAINIAN IE → LATIN CAPITAL LETTER C WITH BAR # + +2377 ; A793 0332 ; MA #* ( ⍷ → ꞓ̲ ) APL FUNCTIONAL SYMBOL EPSILON UNDERBAR → LATIN SMALL LETTER C WITH BAR, COMBINING LOW LINE # →ε̲→ + +037D ; A73F ; MA # ( ͽ → ꜿ ) GREEK SMALL REVERSED DOTTED LUNATE SIGMA SYMBOL → LATIN SMALL LETTER REVERSED C WITH DOT # + +03FF ; A73E ; MA # ( Ͽ → Ꜿ ) GREEK CAPITAL REVERSED DOTTED LUNATE SIGMA SYMBOL → LATIN CAPITAL LETTER REVERSED C WITH DOT # + +217E ; 0064 ; MA # ( ⅾ → d ) SMALL ROMAN NUMERAL FIVE HUNDRED → LATIN SMALL LETTER D # +2146 ; 0064 ; MA # ( ⅆ → d ) DOUBLE-STRUCK ITALIC SMALL D → LATIN SMALL LETTER D # +1D41D ; 0064 ; MA # ( 𝐝 → d ) 
MATHEMATICAL BOLD SMALL D → LATIN SMALL LETTER D # +1D451 ; 0064 ; MA # ( 𝑑 → d ) MATHEMATICAL ITALIC SMALL D → LATIN SMALL LETTER D # +1D485 ; 0064 ; MA # ( 𝒅 → d ) MATHEMATICAL BOLD ITALIC SMALL D → LATIN SMALL LETTER D # +1D4B9 ; 0064 ; MA # ( 𝒹 → d ) MATHEMATICAL SCRIPT SMALL D → LATIN SMALL LETTER D # +1D4ED ; 0064 ; MA # ( 𝓭 → d ) MATHEMATICAL BOLD SCRIPT SMALL D → LATIN SMALL LETTER D # +1D521 ; 0064 ; MA # ( 𝔡 → d ) MATHEMATICAL FRAKTUR SMALL D → LATIN SMALL LETTER D # +1D555 ; 0064 ; MA # ( 𝕕 → d ) MATHEMATICAL DOUBLE-STRUCK SMALL D → LATIN SMALL LETTER D # +1D589 ; 0064 ; MA # ( 𝖉 → d ) MATHEMATICAL BOLD FRAKTUR SMALL D → LATIN SMALL LETTER D # +1D5BD ; 0064 ; MA # ( 𝖽 → d ) MATHEMATICAL SANS-SERIF SMALL D → LATIN SMALL LETTER D # +1D5F1 ; 0064 ; MA # ( 𝗱 → d ) MATHEMATICAL SANS-SERIF BOLD SMALL D → LATIN SMALL LETTER D # +1D625 ; 0064 ; MA # ( 𝘥 → d ) MATHEMATICAL SANS-SERIF ITALIC SMALL D → LATIN SMALL LETTER D # +1D659 ; 0064 ; MA # ( 𝙙 → d ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL D → LATIN SMALL LETTER D # +1D68D ; 0064 ; MA # ( 𝚍 → d ) MATHEMATICAL MONOSPACE SMALL D → LATIN SMALL LETTER D # +0501 ; 0064 ; MA # ( ԁ → d ) CYRILLIC SMALL LETTER KOMI DE → LATIN SMALL LETTER D # +13E7 ; 0064 ; MA # ( Ꮷ → d ) CHEROKEE LETTER TSU → LATIN SMALL LETTER D # +146F ; 0064 ; MA # ( ᑯ → d ) CANADIAN SYLLABICS KO → LATIN SMALL LETTER D # +A4D2 ; 0064 ; MA # ( ꓒ → d ) LISU LETTER PHA → LATIN SMALL LETTER D # + +216E ; 0044 ; MA # ( Ⅾ → D ) ROMAN NUMERAL FIVE HUNDRED → LATIN CAPITAL LETTER D # +2145 ; 0044 ; MA # ( ⅅ → D ) DOUBLE-STRUCK ITALIC CAPITAL D → LATIN CAPITAL LETTER D # +1CCD9 ; 0044 ; MA #* ( 𜳙 → D ) OUTLINED LATIN CAPITAL LETTER D → LATIN CAPITAL LETTER D # +1D403 ; 0044 ; MA # ( 𝐃 → D ) MATHEMATICAL BOLD CAPITAL D → LATIN CAPITAL LETTER D # +1D437 ; 0044 ; MA # ( 𝐷 → D ) MATHEMATICAL ITALIC CAPITAL D → LATIN CAPITAL LETTER D # +1D46B ; 0044 ; MA # ( 𝑫 → D ) MATHEMATICAL BOLD ITALIC CAPITAL D → LATIN CAPITAL LETTER D # +1D49F ; 0044 ; MA # ( 𝒟 → D ) 
MATHEMATICAL SCRIPT CAPITAL D → LATIN CAPITAL LETTER D # +1D4D3 ; 0044 ; MA # ( 𝓓 → D ) MATHEMATICAL BOLD SCRIPT CAPITAL D → LATIN CAPITAL LETTER D # +1D507 ; 0044 ; MA # ( 𝔇 → D ) MATHEMATICAL FRAKTUR CAPITAL D → LATIN CAPITAL LETTER D # +1D53B ; 0044 ; MA # ( 𝔻 → D ) MATHEMATICAL DOUBLE-STRUCK CAPITAL D → LATIN CAPITAL LETTER D # +1D56F ; 0044 ; MA # ( 𝕯 → D ) MATHEMATICAL BOLD FRAKTUR CAPITAL D → LATIN CAPITAL LETTER D # +1D5A3 ; 0044 ; MA # ( 𝖣 → D ) MATHEMATICAL SANS-SERIF CAPITAL D → LATIN CAPITAL LETTER D # +1D5D7 ; 0044 ; MA # ( 𝗗 → D ) MATHEMATICAL SANS-SERIF BOLD CAPITAL D → LATIN CAPITAL LETTER D # +1D60B ; 0044 ; MA # ( 𝘋 → D ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL D → LATIN CAPITAL LETTER D # +1D63F ; 0044 ; MA # ( 𝘿 → D ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL D → LATIN CAPITAL LETTER D # +1D673 ; 0044 ; MA # ( 𝙳 → D ) MATHEMATICAL MONOSPACE CAPITAL D → LATIN CAPITAL LETTER D # +13A0 ; 0044 ; MA # ( Ꭰ → D ) CHEROKEE LETTER A → LATIN CAPITAL LETTER D # +15DE ; 0044 ; MA # ( ᗞ → D ) CANADIAN SYLLABICS CARRIER THE → LATIN CAPITAL LETTER D # +15EA ; 0044 ; MA # ( ᗪ → D ) CANADIAN SYLLABICS CARRIER PE → LATIN CAPITAL LETTER D # →ᗞ→ +A4D3 ; 0044 ; MA # ( ꓓ → D ) LISU LETTER DA → LATIN CAPITAL LETTER D # + +0257 ; 0064 0314 ; MA # ( ɗ → d̔ ) LATIN SMALL LETTER D WITH HOOK → LATIN SMALL LETTER D, COMBINING REVERSED COMMA ABOVE # + +0256 ; 0064 0328 ; MA # ( ɖ → d̨ ) LATIN SMALL LETTER D WITH TAIL → LATIN SMALL LETTER D, COMBINING OGONEK # →d̢→ + +018C ; 0064 0304 ; MA # ( ƌ → d̄ ) LATIN SMALL LETTER D WITH TOPBAR → LATIN SMALL LETTER D, COMBINING MACRON # + +0111 ; 0064 0335 ; MA # ( đ → d̵ ) LATIN SMALL LETTER D WITH STROKE → LATIN SMALL LETTER D, COMBINING SHORT STROKE OVERLAY # + +0110 ; 0044 0335 ; MA # ( Đ → D̵ ) LATIN CAPITAL LETTER D WITH STROKE → LATIN CAPITAL LETTER D, COMBINING SHORT STROKE OVERLAY # +00D0 ; 0044 0335 ; MA # ( Ð → D̵ ) LATIN CAPITAL LETTER ETH → LATIN CAPITAL LETTER D, COMBINING SHORT STROKE OVERLAY # →Đ→ +0189 ; 0044 0335 ; 
MA # ( Ɖ → D̵ ) LATIN CAPITAL LETTER AFRICAN D → LATIN CAPITAL LETTER D, COMBINING SHORT STROKE OVERLAY # →Đ→ + +20AB ; 0064 0335 0331 ; MA #* ( ₫ → ḏ̵ ) DONG SIGN → LATIN SMALL LETTER D, COMBINING SHORT STROKE OVERLAY, COMBINING MACRON BELOW # →đ̱→ + +A77A ; A779 ; MA # ( ꝺ → Ꝺ ) LATIN SMALL LETTER INSULAR D → LATIN CAPITAL LETTER INSULAR D # + +147B ; 0064 00B7 ; MA # ( ᑻ → d· ) CANADIAN SYLLABICS WEST-CREE KWO → LATIN SMALL LETTER D, MIDDLE DOT # →ᑯᐧ→ + +1487 ; 0064 0027 ; MA # ( ᒇ → d' ) CANADIAN SYLLABICS SOUTH-SLAVEY KOH → LATIN SMALL LETTER D, APOSTROPHE # →ᑯᑊ→ + +02A4 ; 0064 021D ; MA # ( ʤ → dȝ ) LATIN SMALL LETTER DEZH DIGRAPH → LATIN SMALL LETTER D, LATIN SMALL LETTER YOGH # →dʒ→ + +01F3 ; 0064 007A ; MA # ( dz → dz ) LATIN SMALL LETTER DZ → LATIN SMALL LETTER D, LATIN SMALL LETTER Z # +02A3 ; 0064 007A ; MA # ( ʣ → dz ) LATIN SMALL LETTER DZ DIGRAPH → LATIN SMALL LETTER D, LATIN SMALL LETTER Z # + +01F2 ; 0044 007A ; MA # ( Dz → Dz ) LATIN CAPITAL LETTER D WITH SMALL LETTER Z → LATIN CAPITAL LETTER D, LATIN SMALL LETTER Z # + +01F1 ; 0044 005A ; MA # ( DZ → DZ ) LATIN CAPITAL LETTER DZ → LATIN CAPITAL LETTER D, LATIN CAPITAL LETTER Z # + +01C6 ; 0064 017E ; MA # ( dž → dž ) LATIN SMALL LETTER DZ WITH CARON → LATIN SMALL LETTER D, LATIN SMALL LETTER Z WITH CARON # + +01C5 ; 0044 017E ; MA # ( Dž → Dž ) LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON → LATIN CAPITAL LETTER D, LATIN SMALL LETTER Z WITH CARON # + +01C4 ; 0044 017D ; MA # ( DŽ → DŽ ) LATIN CAPITAL LETTER DZ WITH CARON → LATIN CAPITAL LETTER D, LATIN CAPITAL LETTER Z WITH CARON # + +02A5 ; 0064 0291 ; MA # ( ʥ → dʑ ) LATIN SMALL LETTER DZ DIGRAPH WITH CURL → LATIN SMALL LETTER D, LATIN SMALL LETTER Z WITH CURL # + +AB70 ; 1D05 ; MA # ( ꭰ → ᴅ ) CHEROKEE SMALL LETTER A → LATIN LETTER SMALL CAPITAL D # + +2E39 ; 1E9F ; MA #* ( ⸹ → ẟ ) TOP HALF SECTION SIGN → LATIN SMALL LETTER DELTA # →δ→ +03B4 ; 1E9F ; MA # ( δ → ẟ ) GREEK SMALL LETTER DELTA → LATIN SMALL LETTER DELTA # +1D6C5 ; 1E9F ; MA 
# ( 𝛅 → ẟ ) MATHEMATICAL BOLD SMALL DELTA → LATIN SMALL LETTER DELTA # →δ→ +1D6FF ; 1E9F ; MA # ( 𝛿 → ẟ ) MATHEMATICAL ITALIC SMALL DELTA → LATIN SMALL LETTER DELTA # →δ→ +1D739 ; 1E9F ; MA # ( 𝜹 → ẟ ) MATHEMATICAL BOLD ITALIC SMALL DELTA → LATIN SMALL LETTER DELTA # →δ→ +1D773 ; 1E9F ; MA # ( 𝝳 → ẟ ) MATHEMATICAL SANS-SERIF BOLD SMALL DELTA → LATIN SMALL LETTER DELTA # →δ→ +1D7AD ; 1E9F ; MA # ( 𝞭 → ẟ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL DELTA → LATIN SMALL LETTER DELTA # →δ→ +2CDD ; 1E9F ; MA # ( ⳝ → ẟ ) COPTIC SMALL LETTER OLD NUBIAN SHIMA → LATIN SMALL LETTER DELTA # →δ→ +056E ; 1E9F ; MA # ( ծ → ẟ ) ARMENIAN SMALL LETTER CA → LATIN SMALL LETTER DELTA # →δ→ +1577 ; 1E9F ; MA # ( ᕷ → ẟ ) CANADIAN SYLLABICS NUNAVIK HO → LATIN SMALL LETTER DELTA # →δ→ + +212E ; 0065 ; MA # ( ℮ → e ) ESTIMATED SYMBOL → LATIN SMALL LETTER E # +FF45 ; 0065 ; MA # ( e → e ) FULLWIDTH LATIN SMALL LETTER E → LATIN SMALL LETTER E # →е→ +212F ; 0065 ; MA # ( ℯ → e ) SCRIPT SMALL E → LATIN SMALL LETTER E # +2147 ; 0065 ; MA # ( ⅇ → e ) DOUBLE-STRUCK ITALIC SMALL E → LATIN SMALL LETTER E # +1D41E ; 0065 ; MA # ( 𝐞 → e ) MATHEMATICAL BOLD SMALL E → LATIN SMALL LETTER E # +1D452 ; 0065 ; MA # ( 𝑒 → e ) MATHEMATICAL ITALIC SMALL E → LATIN SMALL LETTER E # +1D486 ; 0065 ; MA # ( 𝒆 → e ) MATHEMATICAL BOLD ITALIC SMALL E → LATIN SMALL LETTER E # +1D4EE ; 0065 ; MA # ( 𝓮 → e ) MATHEMATICAL BOLD SCRIPT SMALL E → LATIN SMALL LETTER E # +1D522 ; 0065 ; MA # ( 𝔢 → e ) MATHEMATICAL FRAKTUR SMALL E → LATIN SMALL LETTER E # +1D556 ; 0065 ; MA # ( 𝕖 → e ) MATHEMATICAL DOUBLE-STRUCK SMALL E → LATIN SMALL LETTER E # +1D58A ; 0065 ; MA # ( 𝖊 → e ) MATHEMATICAL BOLD FRAKTUR SMALL E → LATIN SMALL LETTER E # +1D5BE ; 0065 ; MA # ( 𝖾 → e ) MATHEMATICAL SANS-SERIF SMALL E → LATIN SMALL LETTER E # +1D5F2 ; 0065 ; MA # ( 𝗲 → e ) MATHEMATICAL SANS-SERIF BOLD SMALL E → LATIN SMALL LETTER E # +1D626 ; 0065 ; MA # ( 𝘦 → e ) MATHEMATICAL SANS-SERIF ITALIC SMALL E → LATIN SMALL LETTER E # +1D65A ; 0065 ; MA # ( 𝙚 
→ e ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL E → LATIN SMALL LETTER E # +1D68E ; 0065 ; MA # ( 𝚎 → e ) MATHEMATICAL MONOSPACE SMALL E → LATIN SMALL LETTER E # +AB32 ; 0065 ; MA # ( ꬲ → e ) LATIN SMALL LETTER BLACKLETTER E → LATIN SMALL LETTER E # +0435 ; 0065 ; MA # ( е → e ) CYRILLIC SMALL LETTER IE → LATIN SMALL LETTER E # +04BD ; 0065 ; MA # ( ҽ → e ) CYRILLIC SMALL LETTER ABKHASIAN CHE → LATIN SMALL LETTER E # + +2DF7 ; 0364 ; MA # ( ⷷ → ͤ ) COMBINING CYRILLIC LETTER IE → COMBINING LATIN SMALL LETTER E # + +22FF ; 0045 ; MA #* ( ⋿ → E ) Z NOTATION BAG MEMBERSHIP → LATIN CAPITAL LETTER E # +FF25 ; 0045 ; MA # ( E → E ) FULLWIDTH LATIN CAPITAL LETTER E → LATIN CAPITAL LETTER E # →Ε→ +2130 ; 0045 ; MA # ( ℰ → E ) SCRIPT CAPITAL E → LATIN CAPITAL LETTER E # +1CCDA ; 0045 ; MA #* ( 𜳚 → E ) OUTLINED LATIN CAPITAL LETTER E → LATIN CAPITAL LETTER E # +1D404 ; 0045 ; MA # ( 𝐄 → E ) MATHEMATICAL BOLD CAPITAL E → LATIN CAPITAL LETTER E # +1D438 ; 0045 ; MA # ( 𝐸 → E ) MATHEMATICAL ITALIC CAPITAL E → LATIN CAPITAL LETTER E # +1D46C ; 0045 ; MA # ( 𝑬 → E ) MATHEMATICAL BOLD ITALIC CAPITAL E → LATIN CAPITAL LETTER E # +1D4D4 ; 0045 ; MA # ( 𝓔 → E ) MATHEMATICAL BOLD SCRIPT CAPITAL E → LATIN CAPITAL LETTER E # +1D508 ; 0045 ; MA # ( 𝔈 → E ) MATHEMATICAL FRAKTUR CAPITAL E → LATIN CAPITAL LETTER E # +1D53C ; 0045 ; MA # ( 𝔼 → E ) MATHEMATICAL DOUBLE-STRUCK CAPITAL E → LATIN CAPITAL LETTER E # +1D570 ; 0045 ; MA # ( 𝕰 → E ) MATHEMATICAL BOLD FRAKTUR CAPITAL E → LATIN CAPITAL LETTER E # +1D5A4 ; 0045 ; MA # ( 𝖤 → E ) MATHEMATICAL SANS-SERIF CAPITAL E → LATIN CAPITAL LETTER E # +1D5D8 ; 0045 ; MA # ( 𝗘 → E ) MATHEMATICAL SANS-SERIF BOLD CAPITAL E → LATIN CAPITAL LETTER E # +1D60C ; 0045 ; MA # ( 𝘌 → E ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL E → LATIN CAPITAL LETTER E # +1D640 ; 0045 ; MA # ( 𝙀 → E ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL E → LATIN CAPITAL LETTER E # +1D674 ; 0045 ; MA # ( 𝙴 → E ) MATHEMATICAL MONOSPACE CAPITAL E → LATIN CAPITAL LETTER E # +0395 ; 0045 ; 
MA # ( Ε → E ) GREEK CAPITAL LETTER EPSILON → LATIN CAPITAL LETTER E # +1D6AC ; 0045 ; MA # ( 𝚬 → E ) MATHEMATICAL BOLD CAPITAL EPSILON → LATIN CAPITAL LETTER E # →𝐄→ +1D6E6 ; 0045 ; MA # ( 𝛦 → E ) MATHEMATICAL ITALIC CAPITAL EPSILON → LATIN CAPITAL LETTER E # →Ε→ +1D720 ; 0045 ; MA # ( 𝜠 → E ) MATHEMATICAL BOLD ITALIC CAPITAL EPSILON → LATIN CAPITAL LETTER E # →Ε→ +1D75A ; 0045 ; MA # ( 𝝚 → E ) MATHEMATICAL SANS-SERIF BOLD CAPITAL EPSILON → LATIN CAPITAL LETTER E # →Ε→ +1D794 ; 0045 ; MA # ( 𝞔 → E ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL EPSILON → LATIN CAPITAL LETTER E # →Ε→ +0415 ; 0045 ; MA # ( Е → E ) CYRILLIC CAPITAL LETTER IE → LATIN CAPITAL LETTER E # +2D39 ; 0045 ; MA # ( ⴹ → E ) TIFINAGH LETTER YADD → LATIN CAPITAL LETTER E # +13AC ; 0045 ; MA # ( Ꭼ → E ) CHEROKEE LETTER GV → LATIN CAPITAL LETTER E # +A4F0 ; 0045 ; MA # ( ꓰ → E ) LISU LETTER E → LATIN CAPITAL LETTER E # +118A6 ; 0045 ; MA # ( 𑢦 → E ) WARANG CITI CAPITAL LETTER II → LATIN CAPITAL LETTER E # +118AE ; 0045 ; MA # ( 𑢮 → E ) WARANG CITI CAPITAL LETTER YUJ → LATIN CAPITAL LETTER E # +10286 ; 0045 ; MA # ( 𐊆 → E ) LYCIAN LETTER I → LATIN CAPITAL LETTER E # + +011B ; 0115 ; MA # ( ě → ĕ ) LATIN SMALL LETTER E WITH CARON → LATIN SMALL LETTER E WITH BREVE # + +011A ; 0114 ; MA # ( Ě → Ĕ ) LATIN CAPITAL LETTER E WITH CARON → LATIN CAPITAL LETTER E WITH BREVE # + +0247 ; 0065 0338 ; MA # ( ɇ → e̸ ) LATIN SMALL LETTER E WITH STROKE → LATIN SMALL LETTER E, COMBINING LONG SOLIDUS OVERLAY # →e̷→ + +0246 ; 0045 0338 ; MA # ( Ɇ → E̸ ) LATIN CAPITAL LETTER E WITH STROKE → LATIN CAPITAL LETTER E, COMBINING LONG SOLIDUS OVERLAY # + +04BF ; 0065 0328 ; MA # ( ҿ → ę ) CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER → LATIN SMALL LETTER E, COMBINING OGONEK # →ҽ̢→ + +AB7C ; 1D07 ; MA # ( ꭼ → ᴇ ) CHEROKEE SMALL LETTER GV → LATIN LETTER SMALL CAPITAL E # + +0259 ; 01DD ; MA # ( ə → ǝ ) LATIN SMALL LETTER SCHWA → LATIN SMALL LETTER TURNED E # +04D9 ; 01DD ; MA # ( ә → ǝ ) CYRILLIC SMALL LETTER SCHWA → LATIN 
SMALL LETTER TURNED E # + +2203 ; 018E ; MA #* ( ∃ → Ǝ ) THERE EXISTS → LATIN CAPITAL LETTER REVERSED E # +2D3A ; 018E ; MA # ( ⴺ → Ǝ ) TIFINAGH LETTER YADDH → LATIN CAPITAL LETTER REVERSED E # +A4F1 ; 018E ; MA # ( ꓱ → Ǝ ) LISU LETTER EU → LATIN CAPITAL LETTER REVERSED E # + +025A ; 01DD 02DE ; MA # ( ɚ → ǝ˞ ) LATIN SMALL LETTER SCHWA WITH HOOK → LATIN SMALL LETTER TURNED E, MODIFIER LETTER RHOTIC HOOK # →ə˞→ + +1D14 ; 01DD 006F ; MA # ( ᴔ → ǝo ) LATIN SMALL LETTER TURNED OE → LATIN SMALL LETTER TURNED E, LATIN SMALL LETTER O # →əo→ + +AB41 ; 01DD 006F 0338 ; MA # ( ꭁ → ǝo̸ ) LATIN SMALL LETTER TURNED OE WITH STROKE → LATIN SMALL LETTER TURNED E, LATIN SMALL LETTER O, COMBINING LONG SOLIDUS OVERLAY # →ǝø→ + +AB42 ; 01DD 006F 0335 ; MA # ( ꭂ → ǝo̵ ) LATIN SMALL LETTER TURNED OE WITH HORIZONTAL STROKE → LATIN SMALL LETTER TURNED E, LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # →ǝɵ→ + +04D8 ; 018F ; MA # ( Ә → Ə ) CYRILLIC CAPITAL LETTER SCHWA → LATIN CAPITAL LETTER SCHWA # + +1D221 ; 0190 ; MA #* ( 𝈡 → Ɛ ) GREEK INSTRUMENTAL NOTATION SYMBOL-7 → LATIN CAPITAL LETTER OPEN E # +2107 ; 0190 ; MA # ( ℇ → Ɛ ) EULER CONSTANT → LATIN CAPITAL LETTER OPEN E # +0510 ; 0190 ; MA # ( Ԑ → Ɛ ) CYRILLIC CAPITAL LETTER REVERSED ZE → LATIN CAPITAL LETTER OPEN E # +13CB ; 0190 ; MA # ( Ꮛ → Ɛ ) CHEROKEE LETTER QUV → LATIN CAPITAL LETTER OPEN E # +16F2D ; 0190 ; MA # ( 𖼭 → Ɛ ) MIAO LETTER NYHA → LATIN CAPITAL LETTER OPEN E # +10401 ; 0190 ; MA # ( 𐐁 → Ɛ ) DESERET CAPITAL LETTER LONG E → LATIN CAPITAL LETTER OPEN E # + +1D9F ; 1D4B ; MA # ( ᶟ → ᵋ ) MODIFIER LETTER SMALL REVERSED OPEN E → MODIFIER LETTER SMALL OPEN E # + +1D08 ; 025C ; MA # ( ᴈ → ɜ ) LATIN SMALL LETTER TURNED OPEN E → LATIN SMALL LETTER REVERSED OPEN E # +0437 ; 025C ; MA # ( з → ɜ ) CYRILLIC SMALL LETTER ZE → LATIN SMALL LETTER REVERSED OPEN E # + +0499 ; 025C 0326 ; MA # ( ҙ → ɜ̦ ) CYRILLIC SMALL LETTER ZE WITH DESCENDER → LATIN SMALL LETTER REVERSED OPEN E, COMBINING COMMA BELOW # →з̡→ + +10442 ; 025E ; MA # ( 
𐑂 → ɞ ) DESERET SMALL LETTER VEE → LATIN SMALL LETTER CLOSED REVERSED OPEN E # + +A79D ; 029A ; MA # ( ꞝ → ʚ ) LATIN SMALL LETTER VOLAPUK OE → LATIN SMALL LETTER CLOSED OPEN E # +1042A ; 029A ; MA # ( 𐐪 → ʚ ) DESERET SMALL LETTER LONG A → LATIN SMALL LETTER CLOSED OPEN E # + +1D41F ; 0066 ; MA # ( 𝐟 → f ) MATHEMATICAL BOLD SMALL F → LATIN SMALL LETTER F # +1D453 ; 0066 ; MA # ( 𝑓 → f ) MATHEMATICAL ITALIC SMALL F → LATIN SMALL LETTER F # +1D487 ; 0066 ; MA # ( 𝒇 → f ) MATHEMATICAL BOLD ITALIC SMALL F → LATIN SMALL LETTER F # +1D4BB ; 0066 ; MA # ( 𝒻 → f ) MATHEMATICAL SCRIPT SMALL F → LATIN SMALL LETTER F # +1D4EF ; 0066 ; MA # ( 𝓯 → f ) MATHEMATICAL BOLD SCRIPT SMALL F → LATIN SMALL LETTER F # +1D523 ; 0066 ; MA # ( 𝔣 → f ) MATHEMATICAL FRAKTUR SMALL F → LATIN SMALL LETTER F # +1D557 ; 0066 ; MA # ( 𝕗 → f ) MATHEMATICAL DOUBLE-STRUCK SMALL F → LATIN SMALL LETTER F # +1D58B ; 0066 ; MA # ( 𝖋 → f ) MATHEMATICAL BOLD FRAKTUR SMALL F → LATIN SMALL LETTER F # +1D5BF ; 0066 ; MA # ( 𝖿 → f ) MATHEMATICAL SANS-SERIF SMALL F → LATIN SMALL LETTER F # +1D5F3 ; 0066 ; MA # ( 𝗳 → f ) MATHEMATICAL SANS-SERIF BOLD SMALL F → LATIN SMALL LETTER F # +1D627 ; 0066 ; MA # ( 𝘧 → f ) MATHEMATICAL SANS-SERIF ITALIC SMALL F → LATIN SMALL LETTER F # +1D65B ; 0066 ; MA # ( 𝙛 → f ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL F → LATIN SMALL LETTER F # +1D68F ; 0066 ; MA # ( 𝚏 → f ) MATHEMATICAL MONOSPACE SMALL F → LATIN SMALL LETTER F # +AB35 ; 0066 ; MA # ( ꬵ → f ) LATIN SMALL LETTER LENIS F → LATIN SMALL LETTER F # +A799 ; 0066 ; MA # ( ꞙ → f ) LATIN SMALL LETTER F WITH STROKE → LATIN SMALL LETTER F # +0192 ; 0066 ; MA # ( ƒ → f ) LATIN SMALL LETTER F WITH HOOK → LATIN SMALL LETTER F # +017F ; 0066 ; MA # ( ſ → f ) LATIN SMALL LETTER LONG S → LATIN SMALL LETTER F # +1E9D ; 0066 ; MA # ( ẝ → f ) LATIN SMALL LETTER LONG S WITH HIGH STROKE → LATIN SMALL LETTER F # +0584 ; 0066 ; MA # ( ք → f ) ARMENIAN SMALL LETTER KEH → LATIN SMALL LETTER F # + +1D213 ; 0046 ; MA #* ( 𝈓 → F ) GREEK VOCAL 
NOTATION SYMBOL-20 → LATIN CAPITAL LETTER F # →Ϝ→ +2131 ; 0046 ; MA # ( ℱ → F ) SCRIPT CAPITAL F → LATIN CAPITAL LETTER F # +1CCDB ; 0046 ; MA #* ( 𜳛 → F ) OUTLINED LATIN CAPITAL LETTER F → LATIN CAPITAL LETTER F # +1D405 ; 0046 ; MA # ( 𝐅 → F ) MATHEMATICAL BOLD CAPITAL F → LATIN CAPITAL LETTER F # +1D439 ; 0046 ; MA # ( 𝐹 → F ) MATHEMATICAL ITALIC CAPITAL F → LATIN CAPITAL LETTER F # +1D46D ; 0046 ; MA # ( 𝑭 → F ) MATHEMATICAL BOLD ITALIC CAPITAL F → LATIN CAPITAL LETTER F # +1D4D5 ; 0046 ; MA # ( 𝓕 → F ) MATHEMATICAL BOLD SCRIPT CAPITAL F → LATIN CAPITAL LETTER F # +1D509 ; 0046 ; MA # ( 𝔉 → F ) MATHEMATICAL FRAKTUR CAPITAL F → LATIN CAPITAL LETTER F # +1D53D ; 0046 ; MA # ( 𝔽 → F ) MATHEMATICAL DOUBLE-STRUCK CAPITAL F → LATIN CAPITAL LETTER F # +1D571 ; 0046 ; MA # ( 𝕱 → F ) MATHEMATICAL BOLD FRAKTUR CAPITAL F → LATIN CAPITAL LETTER F # +1D5A5 ; 0046 ; MA # ( 𝖥 → F ) MATHEMATICAL SANS-SERIF CAPITAL F → LATIN CAPITAL LETTER F # +1D5D9 ; 0046 ; MA # ( 𝗙 → F ) MATHEMATICAL SANS-SERIF BOLD CAPITAL F → LATIN CAPITAL LETTER F # +1D60D ; 0046 ; MA # ( 𝘍 → F ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL F → LATIN CAPITAL LETTER F # +1D641 ; 0046 ; MA # ( 𝙁 → F ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL F → LATIN CAPITAL LETTER F # +1D675 ; 0046 ; MA # ( 𝙵 → F ) MATHEMATICAL MONOSPACE CAPITAL F → LATIN CAPITAL LETTER F # +A798 ; 0046 ; MA # ( Ꞙ → F ) LATIN CAPITAL LETTER F WITH STROKE → LATIN CAPITAL LETTER F # +03DC ; 0046 ; MA # ( Ϝ → F ) GREEK LETTER DIGAMMA → LATIN CAPITAL LETTER F # +1D7CA ; 0046 ; MA # ( 𝟊 → F ) MATHEMATICAL BOLD CAPITAL DIGAMMA → LATIN CAPITAL LETTER F # →Ϝ→ +15B4 ; 0046 ; MA # ( ᖴ → F ) CANADIAN SYLLABICS BLACKFOOT WE → LATIN CAPITAL LETTER F # +A4DD ; 0046 ; MA # ( ꓝ → F ) LISU LETTER TSA → LATIN CAPITAL LETTER F # +118C2 ; 0046 ; MA # ( 𑣂 → F ) WARANG CITI SMALL LETTER WI → LATIN CAPITAL LETTER F # +118A2 ; 0046 ; MA # ( 𑢢 → F ) WARANG CITI CAPITAL LETTER WI → LATIN CAPITAL LETTER F # +10287 ; 0046 ; MA # ( 𐊇 → F ) LYCIAN LETTER W → LATIN 
CAPITAL LETTER F # +102A5 ; 0046 ; MA # ( 𐊥 → F ) CARIAN LETTER R → LATIN CAPITAL LETTER F # +10525 ; 0046 ; MA # ( 𐔥 → F ) ELBASAN LETTER GHE → LATIN CAPITAL LETTER F # + +0191 ; 0046 0326 ; MA # ( Ƒ → F̦ ) LATIN CAPITAL LETTER F WITH HOOK → LATIN CAPITAL LETTER F, COMBINING COMMA BELOW # →F̡→ + +1D6E ; 0066 0334 ; MA # ( ᵮ → f̴ ) LATIN SMALL LETTER F WITH MIDDLE TILDE → LATIN SMALL LETTER F, COMBINING TILDE OVERLAY # + +213B ; 0046 0041 0058 ; MA #* ( ℻ → FAX ) FACSIMILE SIGN → LATIN CAPITAL LETTER F, LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER X # + +FB00 ; 0066 0066 ; MA # ( ff → ff ) LATIN SMALL LIGATURE FF → LATIN SMALL LETTER F, LATIN SMALL LETTER F # + +FB03 ; 0066 0066 0069 ; MA # ( ffi → ffi ) LATIN SMALL LIGATURE FFI → LATIN SMALL LETTER F, LATIN SMALL LETTER F, LATIN SMALL LETTER I # + +FB04 ; 0066 0066 006C ; MA # ( ffl → ffl ) LATIN SMALL LIGATURE FFL → LATIN SMALL LETTER F, LATIN SMALL LETTER F, LATIN SMALL LETTER L # + +FB01 ; 0066 0069 ; MA # ( fi → fi ) LATIN SMALL LIGATURE FI → LATIN SMALL LETTER F, LATIN SMALL LETTER I # + +FB02 ; 0066 006C ; MA # ( fl → fl ) LATIN SMALL LIGATURE FL → LATIN SMALL LETTER F, LATIN SMALL LETTER L # + +02A9 ; 0066 006E 0329 ; MA # ( ʩ → fn̩ ) LATIN SMALL LETTER FENG DIGRAPH → LATIN SMALL LETTER F, LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →fŋ→ + +15B5 ; 2132 ; MA # ( ᖵ → Ⅎ ) CANADIAN SYLLABICS BLACKFOOT WI → TURNED CAPITAL F # +A4DE ; 2132 ; MA # ( ꓞ → Ⅎ ) LISU LETTER TSHA → TURNED CAPITAL F # + +1D230 ; A7FB ; MA #* ( 𝈰 → ꟻ ) GREEK INSTRUMENTAL NOTATION SYMBOL-30 → LATIN EPIGRAPHIC LETTER REVERSED F # +15B7 ; A7FB ; MA # ( ᖷ → ꟻ ) CANADIAN SYLLABICS BLACKFOOT WA → LATIN EPIGRAPHIC LETTER REVERSED F # + +FF47 ; 0067 ; MA # ( g → g ) FULLWIDTH LATIN SMALL LETTER G → LATIN SMALL LETTER G # →ɡ→ +210A ; 0067 ; MA # ( ℊ → g ) SCRIPT SMALL G → LATIN SMALL LETTER G # +1D420 ; 0067 ; MA # ( 𝐠 → g ) MATHEMATICAL BOLD SMALL G → LATIN SMALL LETTER G # +1D454 ; 0067 ; MA # ( 𝑔 → g ) MATHEMATICAL ITALIC SMALL G → 
LATIN SMALL LETTER G # +1D488 ; 0067 ; MA # ( 𝒈 → g ) MATHEMATICAL BOLD ITALIC SMALL G → LATIN SMALL LETTER G # +1D4F0 ; 0067 ; MA # ( 𝓰 → g ) MATHEMATICAL BOLD SCRIPT SMALL G → LATIN SMALL LETTER G # +1D524 ; 0067 ; MA # ( 𝔤 → g ) MATHEMATICAL FRAKTUR SMALL G → LATIN SMALL LETTER G # +1D558 ; 0067 ; MA # ( 𝕘 → g ) MATHEMATICAL DOUBLE-STRUCK SMALL G → LATIN SMALL LETTER G # +1D58C ; 0067 ; MA # ( 𝖌 → g ) MATHEMATICAL BOLD FRAKTUR SMALL G → LATIN SMALL LETTER G # +1D5C0 ; 0067 ; MA # ( 𝗀 → g ) MATHEMATICAL SANS-SERIF SMALL G → LATIN SMALL LETTER G # +1D5F4 ; 0067 ; MA # ( 𝗴 → g ) MATHEMATICAL SANS-SERIF BOLD SMALL G → LATIN SMALL LETTER G # +1D628 ; 0067 ; MA # ( 𝘨 → g ) MATHEMATICAL SANS-SERIF ITALIC SMALL G → LATIN SMALL LETTER G # +1D65C ; 0067 ; MA # ( 𝙜 → g ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL G → LATIN SMALL LETTER G # +1D690 ; 0067 ; MA # ( 𝚐 → g ) MATHEMATICAL MONOSPACE SMALL G → LATIN SMALL LETTER G # +0261 ; 0067 ; MA # ( ɡ → g ) LATIN SMALL LETTER SCRIPT G → LATIN SMALL LETTER G # +1D83 ; 0067 ; MA # ( ᶃ → g ) LATIN SMALL LETTER G WITH PALATAL HOOK → LATIN SMALL LETTER G # +018D ; 0067 ; MA # ( ƍ → g ) LATIN SMALL LETTER TURNED DELTA → LATIN SMALL LETTER G # +0581 ; 0067 ; MA # ( ց → g ) ARMENIAN SMALL LETTER CO → LATIN SMALL LETTER G # + +1CCDC ; 0047 ; MA #* ( 𜳜 → G ) OUTLINED LATIN CAPITAL LETTER G → LATIN CAPITAL LETTER G # +1D406 ; 0047 ; MA # ( 𝐆 → G ) MATHEMATICAL BOLD CAPITAL G → LATIN CAPITAL LETTER G # +1D43A ; 0047 ; MA # ( 𝐺 → G ) MATHEMATICAL ITALIC CAPITAL G → LATIN CAPITAL LETTER G # +1D46E ; 0047 ; MA # ( 𝑮 → G ) MATHEMATICAL BOLD ITALIC CAPITAL G → LATIN CAPITAL LETTER G # +1D4A2 ; 0047 ; MA # ( 𝒢 → G ) MATHEMATICAL SCRIPT CAPITAL G → LATIN CAPITAL LETTER G # +1D4D6 ; 0047 ; MA # ( 𝓖 → G ) MATHEMATICAL BOLD SCRIPT CAPITAL G → LATIN CAPITAL LETTER G # +1D50A ; 0047 ; MA # ( 𝔊 → G ) MATHEMATICAL FRAKTUR CAPITAL G → LATIN CAPITAL LETTER G # +1D53E ; 0047 ; MA # ( 𝔾 → G ) MATHEMATICAL DOUBLE-STRUCK CAPITAL G → LATIN CAPITAL LETTER G # 
+1D572 ; 0047 ; MA # ( 𝕲 → G ) MATHEMATICAL BOLD FRAKTUR CAPITAL G → LATIN CAPITAL LETTER G # +1D5A6 ; 0047 ; MA # ( 𝖦 → G ) MATHEMATICAL SANS-SERIF CAPITAL G → LATIN CAPITAL LETTER G # +1D5DA ; 0047 ; MA # ( 𝗚 → G ) MATHEMATICAL SANS-SERIF BOLD CAPITAL G → LATIN CAPITAL LETTER G # +1D60E ; 0047 ; MA # ( 𝘎 → G ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL G → LATIN CAPITAL LETTER G # +1D642 ; 0047 ; MA # ( 𝙂 → G ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL G → LATIN CAPITAL LETTER G # +1D676 ; 0047 ; MA # ( 𝙶 → G ) MATHEMATICAL MONOSPACE CAPITAL G → LATIN CAPITAL LETTER G # +050C ; 0047 ; MA # ( Ԍ → G ) CYRILLIC CAPITAL LETTER KOMI SJE → LATIN CAPITAL LETTER G # +13C0 ; 0047 ; MA # ( Ꮐ → G ) CHEROKEE LETTER NAH → LATIN CAPITAL LETTER G # +13F3 ; 0047 ; MA # ( Ᏻ → G ) CHEROKEE LETTER YU → LATIN CAPITAL LETTER G # +A4D6 ; 0047 ; MA # ( ꓖ → G ) LISU LETTER GA → LATIN CAPITAL LETTER G # + +1DA2 ; 1D4D ; MA # ( ᶢ → ᵍ ) MODIFIER LETTER SMALL SCRIPT G → MODIFIER LETTER SMALL G # + +0260 ; 0067 0314 ; MA # ( ɠ → g̔ ) LATIN SMALL LETTER G WITH HOOK → LATIN SMALL LETTER G, COMBINING REVERSED COMMA ABOVE # + +01E7 ; 011F ; MA # ( ǧ → ğ ) LATIN SMALL LETTER G WITH CARON → LATIN SMALL LETTER G WITH BREVE # + +01E6 ; 011E ; MA # ( Ǧ → Ğ ) LATIN CAPITAL LETTER G WITH CARON → LATIN CAPITAL LETTER G WITH BREVE # + +01F5 ; 0123 ; MA # ( ǵ → ģ ) LATIN SMALL LETTER G WITH ACUTE → LATIN SMALL LETTER G WITH CEDILLA # + +01E5 ; 0067 0335 ; MA # ( ǥ → g̵ ) LATIN SMALL LETTER G WITH STROKE → LATIN SMALL LETTER G, COMBINING SHORT STROKE OVERLAY # + +01E4 ; 0047 0335 ; MA # ( Ǥ → G̵ ) LATIN CAPITAL LETTER G WITH STROKE → LATIN CAPITAL LETTER G, COMBINING SHORT STROKE OVERLAY # + +0193 ; 0047 0027 ; MA # ( Ɠ → G' ) LATIN CAPITAL LETTER G WITH HOOK → LATIN CAPITAL LETTER G, APOSTROPHE # →Gʽ→ + +050D ; 0262 ; MA # ( ԍ → ɢ ) CYRILLIC SMALL LETTER KOMI SJE → LATIN LETTER SMALL CAPITAL G # +AB90 ; 0262 ; MA # ( ꮐ → ɢ ) CHEROKEE SMALL LETTER NAH → LATIN LETTER SMALL CAPITAL G # +13FB ; 0262 ; MA # ( ᏻ 
→ ɢ ) CHEROKEE SMALL LETTER YU → LATIN LETTER SMALL CAPITAL G # + +FF48 ; 0068 ; MA # ( h → h ) FULLWIDTH LATIN SMALL LETTER H → LATIN SMALL LETTER H # →һ→ +210E ; 0068 ; MA # ( ℎ → h ) PLANCK CONSTANT → LATIN SMALL LETTER H # +1D421 ; 0068 ; MA # ( 𝐡 → h ) MATHEMATICAL BOLD SMALL H → LATIN SMALL LETTER H # +1D489 ; 0068 ; MA # ( 𝒉 → h ) MATHEMATICAL BOLD ITALIC SMALL H → LATIN SMALL LETTER H # +1D4BD ; 0068 ; MA # ( 𝒽 → h ) MATHEMATICAL SCRIPT SMALL H → LATIN SMALL LETTER H # +1D4F1 ; 0068 ; MA # ( 𝓱 → h ) MATHEMATICAL BOLD SCRIPT SMALL H → LATIN SMALL LETTER H # +1D525 ; 0068 ; MA # ( 𝔥 → h ) MATHEMATICAL FRAKTUR SMALL H → LATIN SMALL LETTER H # +1D559 ; 0068 ; MA # ( 𝕙 → h ) MATHEMATICAL DOUBLE-STRUCK SMALL H → LATIN SMALL LETTER H # +1D58D ; 0068 ; MA # ( 𝖍 → h ) MATHEMATICAL BOLD FRAKTUR SMALL H → LATIN SMALL LETTER H # +1D5C1 ; 0068 ; MA # ( 𝗁 → h ) MATHEMATICAL SANS-SERIF SMALL H → LATIN SMALL LETTER H # +1D5F5 ; 0068 ; MA # ( 𝗵 → h ) MATHEMATICAL SANS-SERIF BOLD SMALL H → LATIN SMALL LETTER H # +1D629 ; 0068 ; MA # ( 𝘩 → h ) MATHEMATICAL SANS-SERIF ITALIC SMALL H → LATIN SMALL LETTER H # +1D65D ; 0068 ; MA # ( 𝙝 → h ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL H → LATIN SMALL LETTER H # +1D691 ; 0068 ; MA # ( 𝚑 → h ) MATHEMATICAL MONOSPACE SMALL H → LATIN SMALL LETTER H # +04BB ; 0068 ; MA # ( һ → h ) CYRILLIC SMALL LETTER SHHA → LATIN SMALL LETTER H # +0570 ; 0068 ; MA # ( հ → h ) ARMENIAN SMALL LETTER HO → LATIN SMALL LETTER H # +13C2 ; 0068 ; MA # ( Ꮒ → h ) CHEROKEE LETTER NI → LATIN SMALL LETTER H # + +FF28 ; 0048 ; MA # ( H → H ) FULLWIDTH LATIN CAPITAL LETTER H → LATIN CAPITAL LETTER H # →Η→ +210B ; 0048 ; MA # ( ℋ → H ) SCRIPT CAPITAL H → LATIN CAPITAL LETTER H # +210C ; 0048 ; MA # ( ℌ → H ) BLACK-LETTER CAPITAL H → LATIN CAPITAL LETTER H # +210D ; 0048 ; MA # ( ℍ → H ) DOUBLE-STRUCK CAPITAL H → LATIN CAPITAL LETTER H # +1CCDD ; 0048 ; MA #* ( 𜳝 → H ) OUTLINED LATIN CAPITAL LETTER H → LATIN CAPITAL LETTER H # +1D407 ; 0048 ; MA # ( 𝐇 → H ) 
MATHEMATICAL BOLD CAPITAL H → LATIN CAPITAL LETTER H # +1D43B ; 0048 ; MA # ( 𝐻 → H ) MATHEMATICAL ITALIC CAPITAL H → LATIN CAPITAL LETTER H # +1D46F ; 0048 ; MA # ( 𝑯 → H ) MATHEMATICAL BOLD ITALIC CAPITAL H → LATIN CAPITAL LETTER H # +1D4D7 ; 0048 ; MA # ( 𝓗 → H ) MATHEMATICAL BOLD SCRIPT CAPITAL H → LATIN CAPITAL LETTER H # +1D573 ; 0048 ; MA # ( 𝕳 → H ) MATHEMATICAL BOLD FRAKTUR CAPITAL H → LATIN CAPITAL LETTER H # +1D5A7 ; 0048 ; MA # ( 𝖧 → H ) MATHEMATICAL SANS-SERIF CAPITAL H → LATIN CAPITAL LETTER H # +1D5DB ; 0048 ; MA # ( 𝗛 → H ) MATHEMATICAL SANS-SERIF BOLD CAPITAL H → LATIN CAPITAL LETTER H # +1D60F ; 0048 ; MA # ( 𝘏 → H ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL H → LATIN CAPITAL LETTER H # +1D643 ; 0048 ; MA # ( 𝙃 → H ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL H → LATIN CAPITAL LETTER H # +1D677 ; 0048 ; MA # ( 𝙷 → H ) MATHEMATICAL MONOSPACE CAPITAL H → LATIN CAPITAL LETTER H # +0397 ; 0048 ; MA # ( Η → H ) GREEK CAPITAL LETTER ETA → LATIN CAPITAL LETTER H # +1D6AE ; 0048 ; MA # ( 𝚮 → H ) MATHEMATICAL BOLD CAPITAL ETA → LATIN CAPITAL LETTER H # →Η→ +1D6E8 ; 0048 ; MA # ( 𝛨 → H ) MATHEMATICAL ITALIC CAPITAL ETA → LATIN CAPITAL LETTER H # →Η→ +1D722 ; 0048 ; MA # ( 𝜢 → H ) MATHEMATICAL BOLD ITALIC CAPITAL ETA → LATIN CAPITAL LETTER H # →𝑯→ +1D75C ; 0048 ; MA # ( 𝝜 → H ) MATHEMATICAL SANS-SERIF BOLD CAPITAL ETA → LATIN CAPITAL LETTER H # →Η→ +1D796 ; 0048 ; MA # ( 𝞖 → H ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL ETA → LATIN CAPITAL LETTER H # →Η→ +2C8E ; 0048 ; MA # ( Ⲏ → H ) COPTIC CAPITAL LETTER HATE → LATIN CAPITAL LETTER H # →Η→ +041D ; 0048 ; MA # ( Н → H ) CYRILLIC CAPITAL LETTER EN → LATIN CAPITAL LETTER H # +13BB ; 0048 ; MA # ( Ꮋ → H ) CHEROKEE LETTER MI → LATIN CAPITAL LETTER H # +157C ; 0048 ; MA # ( ᕼ → H ) CANADIAN SYLLABICS NUNAVUT H → LATIN CAPITAL LETTER H # +A4E7 ; 0048 ; MA # ( ꓧ → H ) LISU LETTER XA → LATIN CAPITAL LETTER H # +102CF ; 0048 ; MA # ( 𐋏 → H ) CARIAN LETTER E2 → LATIN CAPITAL LETTER H # + +1D78 ; 1D34 ; MA # ( ᵸ → ᴴ ) 
MODIFIER LETTER CYRILLIC EN → MODIFIER LETTER CAPITAL H # + +0266 ; 0068 0314 ; MA # ( ɦ → h̔ ) LATIN SMALL LETTER H WITH HOOK → LATIN SMALL LETTER H, COMBINING REVERSED COMMA ABOVE # +A695 ; 0068 0314 ; MA # ( ꚕ → h̔ ) CYRILLIC SMALL LETTER HWE → LATIN SMALL LETTER H, COMBINING REVERSED COMMA ABOVE # →ɦ→ +13F2 ; 0068 0314 ; MA # ( Ᏺ → h̔ ) CHEROKEE LETTER YO → LATIN SMALL LETTER H, COMBINING REVERSED COMMA ABOVE # + +2C67 ; 0048 0329 ; MA # ( Ⱨ → H̩ ) LATIN CAPITAL LETTER H WITH DESCENDER → LATIN CAPITAL LETTER H, COMBINING VERTICAL LINE BELOW # →Ң→→Н̩→ +04A2 ; 0048 0329 ; MA # ( Ң → H̩ ) CYRILLIC CAPITAL LETTER EN WITH DESCENDER → LATIN CAPITAL LETTER H, COMBINING VERTICAL LINE BELOW # →Н̩→ + +0127 ; 0068 0335 ; MA # ( ħ → h̵ ) LATIN SMALL LETTER H WITH STROKE → LATIN SMALL LETTER H, COMBINING SHORT STROKE OVERLAY # +210F ; 0068 0335 ; MA # ( ℏ → h̵ ) PLANCK CONSTANT OVER TWO PI → LATIN SMALL LETTER H, COMBINING SHORT STROKE OVERLAY # →ħ→ +045B ; 0068 0335 ; MA # ( ћ → h̵ ) CYRILLIC SMALL LETTER TSHE → LATIN SMALL LETTER H, COMBINING SHORT STROKE OVERLAY # →ħ→ + +0126 ; 0048 0335 ; MA # ( Ħ → H̵ ) LATIN CAPITAL LETTER H WITH STROKE → LATIN CAPITAL LETTER H, COMBINING SHORT STROKE OVERLAY # + +04C9 ; 0048 0326 ; MA # ( Ӊ → H̦ ) CYRILLIC CAPITAL LETTER EN WITH TAIL → LATIN CAPITAL LETTER H, COMBINING COMMA BELOW # →Н̡→ +04C7 ; 0048 0326 ; MA # ( Ӈ → H̦ ) CYRILLIC CAPITAL LETTER EN WITH HOOK → LATIN CAPITAL LETTER H, COMBINING COMMA BELOW # →Н̡→ + +2C8F ; 029C ; MA # ( ⲏ → ʜ ) COPTIC SMALL LETTER HATE → LATIN LETTER SMALL CAPITAL H # →н→ +043D ; 029C ; MA # ( н → ʜ ) CYRILLIC SMALL LETTER EN → LATIN LETTER SMALL CAPITAL H # +AB8B ; 029C ; MA # ( ꮋ → ʜ ) CHEROKEE SMALL LETTER MI → LATIN LETTER SMALL CAPITAL H # + +04A3 ; 029C 0329 ; MA # ( ң → ʜ̩ ) CYRILLIC SMALL LETTER EN WITH DESCENDER → LATIN LETTER SMALL CAPITAL H, COMBINING VERTICAL LINE BELOW # →н̩→ + +04CA ; 029C 0326 ; MA # ( ӊ → ʜ̦ ) CYRILLIC SMALL LETTER EN WITH TAIL → LATIN LETTER SMALL CAPITAL H, 
COMBINING COMMA BELOW # →н̡→ +04C8 ; 029C 0326 ; MA # ( ӈ → ʜ̦ ) CYRILLIC SMALL LETTER EN WITH HOOK → LATIN LETTER SMALL CAPITAL H, COMBINING COMMA BELOW # →н̡→ + +050A ; 01F6 ; MA # ( Ԋ → Ƕ ) CYRILLIC CAPITAL LETTER KOMI NJE → LATIN CAPITAL LETTER HWAIR # + +AB80 ; 2C76 ; MA # ( ꮀ → ⱶ ) CHEROKEE SMALL LETTER HO → LATIN SMALL LETTER HALF H # + +0370 ; 2C75 ; MA # ( Ͱ → Ⱶ ) GREEK CAPITAL LETTER HETA → LATIN CAPITAL LETTER HALF H # →Ꮀ→ +13A8 ; 2C75 ; MA # ( Ꭸ → Ⱶ ) CHEROKEE LETTER GE → LATIN CAPITAL LETTER HALF H # →Ͱ→→Ꮀ→ +13B0 ; 2C75 ; MA # ( Ꮀ → Ⱶ ) CHEROKEE LETTER HO → LATIN CAPITAL LETTER HALF H # +A6B1 ; 2C75 ; MA # ( ꚱ → Ⱶ ) BAMUM LETTER NDAA → LATIN CAPITAL LETTER HALF H # →Ͱ→→Ꮀ→ + +A795 ; A727 ; MA # ( ꞕ → ꜧ ) LATIN SMALL LETTER H WITH PALATAL HOOK → LATIN SMALL LETTER HENG # + +02DB ; 0069 ; MA #* ( ˛ → i ) OGONEK → LATIN SMALL LETTER I # →ͺ→→ι→→ι→ +2373 ; 0069 ; MA #* ( ⍳ → i ) APL FUNCTIONAL SYMBOL IOTA → LATIN SMALL LETTER I # →ɩ→ +FF49 ; 0069 ; MA # ( i → i ) FULLWIDTH LATIN SMALL LETTER I → LATIN SMALL LETTER I # →і→ +2170 ; 0069 ; MA # ( ⅰ → i ) SMALL ROMAN NUMERAL ONE → LATIN SMALL LETTER I # +2139 ; 0069 ; MA # ( ℹ → i ) INFORMATION SOURCE → LATIN SMALL LETTER I # +2148 ; 0069 ; MA # ( ⅈ → i ) DOUBLE-STRUCK ITALIC SMALL I → LATIN SMALL LETTER I # +1D422 ; 0069 ; MA # ( 𝐢 → i ) MATHEMATICAL BOLD SMALL I → LATIN SMALL LETTER I # +1D456 ; 0069 ; MA # ( 𝑖 → i ) MATHEMATICAL ITALIC SMALL I → LATIN SMALL LETTER I # +1D48A ; 0069 ; MA # ( 𝒊 → i ) MATHEMATICAL BOLD ITALIC SMALL I → LATIN SMALL LETTER I # +1D4BE ; 0069 ; MA # ( 𝒾 → i ) MATHEMATICAL SCRIPT SMALL I → LATIN SMALL LETTER I # +1D4F2 ; 0069 ; MA # ( 𝓲 → i ) MATHEMATICAL BOLD SCRIPT SMALL I → LATIN SMALL LETTER I # +1D526 ; 0069 ; MA # ( 𝔦 → i ) MATHEMATICAL FRAKTUR SMALL I → LATIN SMALL LETTER I # +1D55A ; 0069 ; MA # ( 𝕚 → i ) MATHEMATICAL DOUBLE-STRUCK SMALL I → LATIN SMALL LETTER I # +1D58E ; 0069 ; MA # ( 𝖎 → i ) MATHEMATICAL BOLD FRAKTUR SMALL I → LATIN SMALL LETTER I # +1D5C2 ; 0069 ; MA # ( 
𝗂 → i ) MATHEMATICAL SANS-SERIF SMALL I → LATIN SMALL LETTER I # +1D5F6 ; 0069 ; MA # ( 𝗶 → i ) MATHEMATICAL SANS-SERIF BOLD SMALL I → LATIN SMALL LETTER I # +1D62A ; 0069 ; MA # ( 𝘪 → i ) MATHEMATICAL SANS-SERIF ITALIC SMALL I → LATIN SMALL LETTER I # +1D65E ; 0069 ; MA # ( 𝙞 → i ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL I → LATIN SMALL LETTER I # +1D692 ; 0069 ; MA # ( 𝚒 → i ) MATHEMATICAL MONOSPACE SMALL I → LATIN SMALL LETTER I # +0131 ; 0069 ; MA # ( ı → i ) LATIN SMALL LETTER DOTLESS I → LATIN SMALL LETTER I # +1D6A4 ; 0069 ; MA # ( 𝚤 → i ) MATHEMATICAL ITALIC SMALL DOTLESS I → LATIN SMALL LETTER I # →ı→ +026A ; 0069 ; MA # ( ɪ → i ) LATIN LETTER SMALL CAPITAL I → LATIN SMALL LETTER I # →ı→ +0269 ; 0069 ; MA # ( ɩ → i ) LATIN SMALL LETTER IOTA → LATIN SMALL LETTER I # +03B9 ; 0069 ; MA # ( ι → i ) GREEK SMALL LETTER IOTA → LATIN SMALL LETTER I # +1FBE ; 0069 ; MA # ( ι → i ) GREEK PROSGEGRAMMENI → LATIN SMALL LETTER I # →ι→ +037A ; 0069 ; MA #* ( ͺ → i ) GREEK YPOGEGRAMMENI → LATIN SMALL LETTER I # →ι→→ι→ +1D6CA ; 0069 ; MA # ( 𝛊 → i ) MATHEMATICAL BOLD SMALL IOTA → LATIN SMALL LETTER I # →ι→ +1D704 ; 0069 ; MA # ( 𝜄 → i ) MATHEMATICAL ITALIC SMALL IOTA → LATIN SMALL LETTER I # →ι→ +1D73E ; 0069 ; MA # ( 𝜾 → i ) MATHEMATICAL BOLD ITALIC SMALL IOTA → LATIN SMALL LETTER I # →ι→ +1D778 ; 0069 ; MA # ( 𝝸 → i ) MATHEMATICAL SANS-SERIF BOLD SMALL IOTA → LATIN SMALL LETTER I # →ι→ +1D7B2 ; 0069 ; MA # ( 𝞲 → i ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL IOTA → LATIN SMALL LETTER I # →ι→ +2C93 ; 0069 ; MA # ( ⲓ → i ) COPTIC SMALL LETTER IAUDA → LATIN SMALL LETTER I # →ı→ +0456 ; 0069 ; MA # ( і → i ) CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I → LATIN SMALL LETTER I # +A647 ; 0069 ; MA # ( ꙇ → i ) CYRILLIC SMALL LETTER IOTA → LATIN SMALL LETTER I # →ι→ +0582 ; 0069 ; MA # ( ւ → i ) ARMENIAN SMALL LETTER YIWN → LATIN SMALL LETTER I # →ı→ +AB75 ; 0069 ; MA # ( ꭵ → i ) CHEROKEE SMALL LETTER V → LATIN SMALL LETTER I # +13A5 ; 0069 ; MA # ( Ꭵ → i ) CHEROKEE LETTER V → 
LATIN SMALL LETTER I # +118C3 ; 0069 ; MA # ( 𑣃 → i ) WARANG CITI SMALL LETTER YU → LATIN SMALL LETTER I # →ι→ + +24DB ; 24BE ; MA #* ( ⓛ → Ⓘ ) CIRCLED LATIN SMALL LETTER L → CIRCLED LATIN CAPITAL LETTER I # + +2378 ; 0069 0332 ; MA #* ( ⍸ → i̲ ) APL FUNCTIONAL SYMBOL IOTA UNDERBAR → LATIN SMALL LETTER I, COMBINING LOW LINE # →ι̲→ + +01D0 ; 012D ; MA # ( ǐ → ĭ ) LATIN SMALL LETTER I WITH CARON → LATIN SMALL LETTER I WITH BREVE # + +01CF ; 012C ; MA # ( Ǐ → Ĭ ) LATIN CAPITAL LETTER I WITH CARON → LATIN CAPITAL LETTER I WITH BREVE # + +0268 ; 0069 0335 ; MA # ( ɨ → i̵ ) LATIN SMALL LETTER I WITH STROKE → LATIN SMALL LETTER I, COMBINING SHORT STROKE OVERLAY # +1D7B ; 0069 0335 ; MA # ( ᵻ → i̵ ) LATIN SMALL CAPITAL LETTER I WITH STROKE → LATIN SMALL LETTER I, COMBINING SHORT STROKE OVERLAY # →ɪ̵→ +1D7C ; 0069 0335 ; MA # ( ᵼ → i̵ ) LATIN SMALL LETTER IOTA WITH STROKE → LATIN SMALL LETTER I, COMBINING SHORT STROKE OVERLAY # →ɩ̵→ + +2171 ; 0069 0069 ; MA # ( ⅱ → ii ) SMALL ROMAN NUMERAL TWO → LATIN SMALL LETTER I, LATIN SMALL LETTER I # + +2172 ; 0069 0069 0069 ; MA # ( ⅲ → iii ) SMALL ROMAN NUMERAL THREE → LATIN SMALL LETTER I, LATIN SMALL LETTER I, LATIN SMALL LETTER I # + +0133 ; 0069 006A ; MA # ( ij → ij ) LATIN SMALL LIGATURE IJ → LATIN SMALL LETTER I, LATIN SMALL LETTER J # + +2173 ; 0069 0076 ; MA # ( ⅳ → iv ) SMALL ROMAN NUMERAL FOUR → LATIN SMALL LETTER I, LATIN SMALL LETTER V # + +2178 ; 0069 0078 ; MA # ( ⅸ → ix ) SMALL ROMAN NUMERAL NINE → LATIN SMALL LETTER I, LATIN SMALL LETTER X # + +FF4A ; 006A ; MA # ( j → j ) FULLWIDTH LATIN SMALL LETTER J → LATIN SMALL LETTER J # →ϳ→ +2149 ; 006A ; MA # ( ⅉ → j ) DOUBLE-STRUCK ITALIC SMALL J → LATIN SMALL LETTER J # +1D423 ; 006A ; MA # ( 𝐣 → j ) MATHEMATICAL BOLD SMALL J → LATIN SMALL LETTER J # +1D457 ; 006A ; MA # ( 𝑗 → j ) MATHEMATICAL ITALIC SMALL J → LATIN SMALL LETTER J # +1D48B ; 006A ; MA # ( 𝒋 → j ) MATHEMATICAL BOLD ITALIC SMALL J → LATIN SMALL LETTER J # +1D4BF ; 006A ; MA # ( 𝒿 → j ) MATHEMATICAL SCRIPT 
SMALL J → LATIN SMALL LETTER J # +1D4F3 ; 006A ; MA # ( 𝓳 → j ) MATHEMATICAL BOLD SCRIPT SMALL J → LATIN SMALL LETTER J # +1D527 ; 006A ; MA # ( 𝔧 → j ) MATHEMATICAL FRAKTUR SMALL J → LATIN SMALL LETTER J # +1D55B ; 006A ; MA # ( 𝕛 → j ) MATHEMATICAL DOUBLE-STRUCK SMALL J → LATIN SMALL LETTER J # +1D58F ; 006A ; MA # ( 𝖏 → j ) MATHEMATICAL BOLD FRAKTUR SMALL J → LATIN SMALL LETTER J # +1D5C3 ; 006A ; MA # ( 𝗃 → j ) MATHEMATICAL SANS-SERIF SMALL J → LATIN SMALL LETTER J # +1D5F7 ; 006A ; MA # ( 𝗷 → j ) MATHEMATICAL SANS-SERIF BOLD SMALL J → LATIN SMALL LETTER J # +1D62B ; 006A ; MA # ( 𝘫 → j ) MATHEMATICAL SANS-SERIF ITALIC SMALL J → LATIN SMALL LETTER J # +1D65F ; 006A ; MA # ( 𝙟 → j ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL J → LATIN SMALL LETTER J # +1D693 ; 006A ; MA # ( 𝚓 → j ) MATHEMATICAL MONOSPACE SMALL J → LATIN SMALL LETTER J # +03F3 ; 006A ; MA # ( ϳ → j ) GREEK LETTER YOT → LATIN SMALL LETTER J # +0458 ; 006A ; MA # ( ј → j ) CYRILLIC SMALL LETTER JE → LATIN SMALL LETTER J # + +FF2A ; 004A ; MA # ( J → J ) FULLWIDTH LATIN CAPITAL LETTER J → LATIN CAPITAL LETTER J # →Ј→ +1CCDF ; 004A ; MA #* ( 𜳟 → J ) OUTLINED LATIN CAPITAL LETTER J → LATIN CAPITAL LETTER J # +1D409 ; 004A ; MA # ( 𝐉 → J ) MATHEMATICAL BOLD CAPITAL J → LATIN CAPITAL LETTER J # +1D43D ; 004A ; MA # ( 𝐽 → J ) MATHEMATICAL ITALIC CAPITAL J → LATIN CAPITAL LETTER J # +1D471 ; 004A ; MA # ( 𝑱 → J ) MATHEMATICAL BOLD ITALIC CAPITAL J → LATIN CAPITAL LETTER J # +1D4A5 ; 004A ; MA # ( 𝒥 → J ) MATHEMATICAL SCRIPT CAPITAL J → LATIN CAPITAL LETTER J # +1D4D9 ; 004A ; MA # ( 𝓙 → J ) MATHEMATICAL BOLD SCRIPT CAPITAL J → LATIN CAPITAL LETTER J # +1D50D ; 004A ; MA # ( 𝔍 → J ) MATHEMATICAL FRAKTUR CAPITAL J → LATIN CAPITAL LETTER J # +1D541 ; 004A ; MA # ( 𝕁 → J ) MATHEMATICAL DOUBLE-STRUCK CAPITAL J → LATIN CAPITAL LETTER J # +1D575 ; 004A ; MA # ( 𝕵 → J ) MATHEMATICAL BOLD FRAKTUR CAPITAL J → LATIN CAPITAL LETTER J # +1D5A9 ; 004A ; MA # ( 𝖩 → J ) MATHEMATICAL SANS-SERIF CAPITAL J → LATIN CAPITAL 
LETTER J # +1D5DD ; 004A ; MA # ( 𝗝 → J ) MATHEMATICAL SANS-SERIF BOLD CAPITAL J → LATIN CAPITAL LETTER J # +1D611 ; 004A ; MA # ( 𝘑 → J ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL J → LATIN CAPITAL LETTER J # +1D645 ; 004A ; MA # ( 𝙅 → J ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL J → LATIN CAPITAL LETTER J # +1D679 ; 004A ; MA # ( 𝙹 → J ) MATHEMATICAL MONOSPACE CAPITAL J → LATIN CAPITAL LETTER J # +A7B2 ; 004A ; MA # ( Ʝ → J ) LATIN CAPITAL LETTER J WITH CROSSED-TAIL → LATIN CAPITAL LETTER J # +037F ; 004A ; MA # ( Ϳ → J ) GREEK CAPITAL LETTER YOT → LATIN CAPITAL LETTER J # +0408 ; 004A ; MA # ( Ј → J ) CYRILLIC CAPITAL LETTER JE → LATIN CAPITAL LETTER J # +13AB ; 004A ; MA # ( Ꭻ → J ) CHEROKEE LETTER GU → LATIN CAPITAL LETTER J # +148D ; 004A ; MA # ( ᒍ → J ) CANADIAN SYLLABICS CO → LATIN CAPITAL LETTER J # +A4D9 ; 004A ; MA # ( ꓙ → J ) LISU LETTER JA → LATIN CAPITAL LETTER J # + +0249 ; 006A 0335 ; MA # ( ɉ → j̵ ) LATIN SMALL LETTER J WITH STROKE → LATIN SMALL LETTER J, COMBINING SHORT STROKE OVERLAY # + +0248 ; 004A 0335 ; MA # ( Ɉ → J̵ ) LATIN CAPITAL LETTER J WITH STROKE → LATIN CAPITAL LETTER J, COMBINING SHORT STROKE OVERLAY # + +1499 ; 004A 00B7 ; MA # ( ᒙ → J· ) CANADIAN SYLLABICS WEST-CREE CWO → LATIN CAPITAL LETTER J, MIDDLE DOT # →ᒍᐧ→ + +1D6A5 ; 0237 ; MA # ( 𝚥 → ȷ ) MATHEMATICAL ITALIC SMALL DOTLESS J → LATIN SMALL LETTER DOTLESS J # +0575 ; 0237 ; MA # ( յ → ȷ ) ARMENIAN SMALL LETTER YI → LATIN SMALL LETTER DOTLESS J # + +AB7B ; 1D0A ; MA # ( ꭻ → ᴊ ) CHEROKEE SMALL LETTER GU → LATIN LETTER SMALL CAPITAL J # + +1D424 ; 006B ; MA # ( 𝐤 → k ) MATHEMATICAL BOLD SMALL K → LATIN SMALL LETTER K # +1D458 ; 006B ; MA # ( 𝑘 → k ) MATHEMATICAL ITALIC SMALL K → LATIN SMALL LETTER K # +1D48C ; 006B ; MA # ( 𝒌 → k ) MATHEMATICAL BOLD ITALIC SMALL K → LATIN SMALL LETTER K # +1D4C0 ; 006B ; MA # ( 𝓀 → k ) MATHEMATICAL SCRIPT SMALL K → LATIN SMALL LETTER K # +1D4F4 ; 006B ; MA # ( 𝓴 → k ) MATHEMATICAL BOLD SCRIPT SMALL K → LATIN SMALL LETTER K # +1D528 ; 006B ; MA # 
( 𝔨 → k ) MATHEMATICAL FRAKTUR SMALL K → LATIN SMALL LETTER K # +1D55C ; 006B ; MA # ( 𝕜 → k ) MATHEMATICAL DOUBLE-STRUCK SMALL K → LATIN SMALL LETTER K # +1D590 ; 006B ; MA # ( 𝖐 → k ) MATHEMATICAL BOLD FRAKTUR SMALL K → LATIN SMALL LETTER K # +1D5C4 ; 006B ; MA # ( 𝗄 → k ) MATHEMATICAL SANS-SERIF SMALL K → LATIN SMALL LETTER K # +1D5F8 ; 006B ; MA # ( 𝗸 → k ) MATHEMATICAL SANS-SERIF BOLD SMALL K → LATIN SMALL LETTER K # +1D62C ; 006B ; MA # ( 𝘬 → k ) MATHEMATICAL SANS-SERIF ITALIC SMALL K → LATIN SMALL LETTER K # +1D660 ; 006B ; MA # ( 𝙠 → k ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL K → LATIN SMALL LETTER K # +1D694 ; 006B ; MA # ( 𝚔 → k ) MATHEMATICAL MONOSPACE SMALL K → LATIN SMALL LETTER K # + +212A ; 004B ; MA # ( K → K ) KELVIN SIGN → LATIN CAPITAL LETTER K # +FF2B ; 004B ; MA # ( K → K ) FULLWIDTH LATIN CAPITAL LETTER K → LATIN CAPITAL LETTER K # →Κ→ +1CCE0 ; 004B ; MA #* ( 𜳠 → K ) OUTLINED LATIN CAPITAL LETTER K → LATIN CAPITAL LETTER K # +1D40A ; 004B ; MA # ( 𝐊 → K ) MATHEMATICAL BOLD CAPITAL K → LATIN CAPITAL LETTER K # +1D43E ; 004B ; MA # ( 𝐾 → K ) MATHEMATICAL ITALIC CAPITAL K → LATIN CAPITAL LETTER K # +1D472 ; 004B ; MA # ( 𝑲 → K ) MATHEMATICAL BOLD ITALIC CAPITAL K → LATIN CAPITAL LETTER K # +1D4A6 ; 004B ; MA # ( 𝒦 → K ) MATHEMATICAL SCRIPT CAPITAL K → LATIN CAPITAL LETTER K # +1D4DA ; 004B ; MA # ( 𝓚 → K ) MATHEMATICAL BOLD SCRIPT CAPITAL K → LATIN CAPITAL LETTER K # +1D50E ; 004B ; MA # ( 𝔎 → K ) MATHEMATICAL FRAKTUR CAPITAL K → LATIN CAPITAL LETTER K # +1D542 ; 004B ; MA # ( 𝕂 → K ) MATHEMATICAL DOUBLE-STRUCK CAPITAL K → LATIN CAPITAL LETTER K # +1D576 ; 004B ; MA # ( 𝕶 → K ) MATHEMATICAL BOLD FRAKTUR CAPITAL K → LATIN CAPITAL LETTER K # +1D5AA ; 004B ; MA # ( 𝖪 → K ) MATHEMATICAL SANS-SERIF CAPITAL K → LATIN CAPITAL LETTER K # +1D5DE ; 004B ; MA # ( 𝗞 → K ) MATHEMATICAL SANS-SERIF BOLD CAPITAL K → LATIN CAPITAL LETTER K # +1D612 ; 004B ; MA # ( 𝘒 → K ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL K → LATIN CAPITAL LETTER K # +1D646 ; 004B ; MA # 
( 𝙆 → K ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL K → LATIN CAPITAL LETTER K # +1D67A ; 004B ; MA # ( 𝙺 → K ) MATHEMATICAL MONOSPACE CAPITAL K → LATIN CAPITAL LETTER K # +039A ; 004B ; MA # ( Κ → K ) GREEK CAPITAL LETTER KAPPA → LATIN CAPITAL LETTER K # +1D6B1 ; 004B ; MA # ( 𝚱 → K ) MATHEMATICAL BOLD CAPITAL KAPPA → LATIN CAPITAL LETTER K # →Κ→ +1D6EB ; 004B ; MA # ( 𝛫 → K ) MATHEMATICAL ITALIC CAPITAL KAPPA → LATIN CAPITAL LETTER K # →𝐾→ +1D725 ; 004B ; MA # ( 𝜥 → K ) MATHEMATICAL BOLD ITALIC CAPITAL KAPPA → LATIN CAPITAL LETTER K # →𝑲→ +1D75F ; 004B ; MA # ( 𝝟 → K ) MATHEMATICAL SANS-SERIF BOLD CAPITAL KAPPA → LATIN CAPITAL LETTER K # →Κ→ +1D799 ; 004B ; MA # ( 𝞙 → K ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL KAPPA → LATIN CAPITAL LETTER K # →Κ→ +2C94 ; 004B ; MA # ( Ⲕ → K ) COPTIC CAPITAL LETTER KAPA → LATIN CAPITAL LETTER K # →Κ→ +041A ; 004B ; MA # ( К → K ) CYRILLIC CAPITAL LETTER KA → LATIN CAPITAL LETTER K # +13E6 ; 004B ; MA # ( Ꮶ → K ) CHEROKEE LETTER TSO → LATIN CAPITAL LETTER K # +16D5 ; 004B ; MA # ( ᛕ → K ) RUNIC LETTER OPEN-P → LATIN CAPITAL LETTER K # +A4D7 ; 004B ; MA # ( ꓗ → K ) LISU LETTER KA → LATIN CAPITAL LETTER K # +10518 ; 004B ; MA # ( 𐔘 → K ) ELBASAN LETTER QE → LATIN CAPITAL LETTER K # + +0199 ; 006B 0314 ; MA # ( ƙ → k̔ ) LATIN SMALL LETTER K WITH HOOK → LATIN SMALL LETTER K, COMBINING REVERSED COMMA ABOVE # + +2C69 ; 004B 0329 ; MA # ( Ⱪ → K̩ ) LATIN CAPITAL LETTER K WITH DESCENDER → LATIN CAPITAL LETTER K, COMBINING VERTICAL LINE BELOW # →Қ→→К̩→ +049A ; 004B 0329 ; MA # ( Қ → K̩ ) CYRILLIC CAPITAL LETTER KA WITH DESCENDER → LATIN CAPITAL LETTER K, COMBINING VERTICAL LINE BELOW # →К̩→ + +20AD ; 004B 0335 ; MA #* ( ₭ → K̵ ) KIP SIGN → LATIN CAPITAL LETTER K, COMBINING SHORT STROKE OVERLAY # →K̶→ +A740 ; 004B 0335 ; MA # ( Ꝁ → K̵ ) LATIN CAPITAL LETTER K WITH STROKE → LATIN CAPITAL LETTER K, COMBINING SHORT STROKE OVERLAY # →Ҟ→→К̵→ +049E ; 004B 0335 ; MA # ( Ҟ → K̵ ) CYRILLIC CAPITAL LETTER KA WITH STROKE → LATIN CAPITAL LETTER K, 
COMBINING SHORT STROKE OVERLAY # →К̵→ + +0198 ; 004B 0027 ; MA # ( Ƙ → K' ) LATIN CAPITAL LETTER K WITH HOOK → LATIN CAPITAL LETTER K, APOSTROPHE # →Kʽ→ + +05C0 ; 006C ; MA #* ( ‎׀‎ → l ) HEBREW PUNCTUATION PASEQ → LATIN SMALL LETTER L # →|→ +007C ; 006C ; MA #* ( | → l ) VERTICAL LINE → LATIN SMALL LETTER L # +2223 ; 006C ; MA #* ( ∣ → l ) DIVIDES → LATIN SMALL LETTER L # →ǀ→ +23FD ; 006C ; MA #* ( ⏽ → l ) POWER ON SYMBOL → LATIN SMALL LETTER L # →I→ +FFE8 ; 006C ; MA #* ( │ → l ) HALFWIDTH FORMS LIGHT VERTICAL → LATIN SMALL LETTER L # →|→ +0031 ; 006C ; MA # ( 1 → l ) DIGIT ONE → LATIN SMALL LETTER L # +0661 ; 006C ; MA # ( ‎١‎ → l ) ARABIC-INDIC DIGIT ONE → LATIN SMALL LETTER L # →1→ +06F1 ; 006C ; MA # ( ۱ → l ) EXTENDED ARABIC-INDIC DIGIT ONE → LATIN SMALL LETTER L # →1→ +10320 ; 006C ; MA #* ( 𐌠 → l ) OLD ITALIC NUMERAL ONE → LATIN SMALL LETTER L # →𐌉→→I→ +1E8C7 ; 006C ; MA #* ( ‎𞣇‎ → l ) MENDE KIKAKUI DIGIT ONE → LATIN SMALL LETTER L # +1CCF1 ; 006C ; MA # ( 𜳱 → l ) OUTLINED DIGIT ONE → LATIN SMALL LETTER L # →1→ +1D7CF ; 006C ; MA # ( 𝟏 → l ) MATHEMATICAL BOLD DIGIT ONE → LATIN SMALL LETTER L # →1→ +1D7D9 ; 006C ; MA # ( 𝟙 → l ) MATHEMATICAL DOUBLE-STRUCK DIGIT ONE → LATIN SMALL LETTER L # →1→ +1D7E3 ; 006C ; MA # ( 𝟣 → l ) MATHEMATICAL SANS-SERIF DIGIT ONE → LATIN SMALL LETTER L # →1→ +1D7ED ; 006C ; MA # ( 𝟭 → l ) MATHEMATICAL SANS-SERIF BOLD DIGIT ONE → LATIN SMALL LETTER L # →1→ +1D7F7 ; 006C ; MA # ( 𝟷 → l ) MATHEMATICAL MONOSPACE DIGIT ONE → LATIN SMALL LETTER L # →1→ +1FBF1 ; 006C ; MA # ( 🯱 → l ) SEGMENTED DIGIT ONE → LATIN SMALL LETTER L # →1→ +0049 ; 006C ; MA # ( I → l ) LATIN CAPITAL LETTER I → LATIN SMALL LETTER L # +FF29 ; 006C ; MA # ( I → l ) FULLWIDTH LATIN CAPITAL LETTER I → LATIN SMALL LETTER L # →Ӏ→ +2160 ; 006C ; MA # ( Ⅰ → l ) ROMAN NUMERAL ONE → LATIN SMALL LETTER L # →Ӏ→ +2110 ; 006C ; MA # ( ℐ → l ) SCRIPT CAPITAL I → LATIN SMALL LETTER L # →I→ +2111 ; 006C ; MA # ( ℑ → l ) BLACK-LETTER CAPITAL I → LATIN SMALL LETTER L # →I→ +1CCDE 
; 006C ; MA #* ( 𜳞 → l ) OUTLINED LATIN CAPITAL LETTER I → LATIN SMALL LETTER L # →I→ +1D408 ; 006C ; MA # ( 𝐈 → l ) MATHEMATICAL BOLD CAPITAL I → LATIN SMALL LETTER L # →I→ +1D43C ; 006C ; MA # ( 𝐼 → l ) MATHEMATICAL ITALIC CAPITAL I → LATIN SMALL LETTER L # →I→ +1D470 ; 006C ; MA # ( 𝑰 → l ) MATHEMATICAL BOLD ITALIC CAPITAL I → LATIN SMALL LETTER L # →I→ +1D4D8 ; 006C ; MA # ( 𝓘 → l ) MATHEMATICAL BOLD SCRIPT CAPITAL I → LATIN SMALL LETTER L # →I→ +1D540 ; 006C ; MA # ( 𝕀 → l ) MATHEMATICAL DOUBLE-STRUCK CAPITAL I → LATIN SMALL LETTER L # →I→ +1D574 ; 006C ; MA # ( 𝕴 → l ) MATHEMATICAL BOLD FRAKTUR CAPITAL I → LATIN SMALL LETTER L # →I→ +1D5A8 ; 006C ; MA # ( 𝖨 → l ) MATHEMATICAL SANS-SERIF CAPITAL I → LATIN SMALL LETTER L # →I→ +1D5DC ; 006C ; MA # ( 𝗜 → l ) MATHEMATICAL SANS-SERIF BOLD CAPITAL I → LATIN SMALL LETTER L # →I→ +1D610 ; 006C ; MA # ( 𝘐 → l ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL I → LATIN SMALL LETTER L # →I→ +1D644 ; 006C ; MA # ( 𝙄 → l ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL I → LATIN SMALL LETTER L # →I→ +1D678 ; 006C ; MA # ( 𝙸 → l ) MATHEMATICAL MONOSPACE CAPITAL I → LATIN SMALL LETTER L # →I→ +0196 ; 006C ; MA # ( Ɩ → l ) LATIN CAPITAL LETTER IOTA → LATIN SMALL LETTER L # +FF4C ; 006C ; MA # ( l → l ) FULLWIDTH LATIN SMALL LETTER L → LATIN SMALL LETTER L # →Ⅰ→→Ӏ→ +217C ; 006C ; MA # ( ⅼ → l ) SMALL ROMAN NUMERAL FIFTY → LATIN SMALL LETTER L # +2113 ; 006C ; MA # ( ℓ → l ) SCRIPT SMALL L → LATIN SMALL LETTER L # +1D425 ; 006C ; MA # ( 𝐥 → l ) MATHEMATICAL BOLD SMALL L → LATIN SMALL LETTER L # +1D459 ; 006C ; MA # ( 𝑙 → l ) MATHEMATICAL ITALIC SMALL L → LATIN SMALL LETTER L # +1D48D ; 006C ; MA # ( 𝒍 → l ) MATHEMATICAL BOLD ITALIC SMALL L → LATIN SMALL LETTER L # +1D4C1 ; 006C ; MA # ( 𝓁 → l ) MATHEMATICAL SCRIPT SMALL L → LATIN SMALL LETTER L # +1D4F5 ; 006C ; MA # ( 𝓵 → l ) MATHEMATICAL BOLD SCRIPT SMALL L → LATIN SMALL LETTER L # +1D529 ; 006C ; MA # ( 𝔩 → l ) MATHEMATICAL FRAKTUR SMALL L → LATIN SMALL LETTER L # +1D55D ; 006C ; MA # 
( 𝕝 → l ) MATHEMATICAL DOUBLE-STRUCK SMALL L → LATIN SMALL LETTER L # +1D591 ; 006C ; MA # ( 𝖑 → l ) MATHEMATICAL BOLD FRAKTUR SMALL L → LATIN SMALL LETTER L # +1D5C5 ; 006C ; MA # ( 𝗅 → l ) MATHEMATICAL SANS-SERIF SMALL L → LATIN SMALL LETTER L # +1D5F9 ; 006C ; MA # ( 𝗹 → l ) MATHEMATICAL SANS-SERIF BOLD SMALL L → LATIN SMALL LETTER L # +1D62D ; 006C ; MA # ( 𝘭 → l ) MATHEMATICAL SANS-SERIF ITALIC SMALL L → LATIN SMALL LETTER L # +1D661 ; 006C ; MA # ( 𝙡 → l ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL L → LATIN SMALL LETTER L # +1D695 ; 006C ; MA # ( 𝚕 → l ) MATHEMATICAL MONOSPACE SMALL L → LATIN SMALL LETTER L # +01C0 ; 006C ; MA # ( ǀ → l ) LATIN LETTER DENTAL CLICK → LATIN SMALL LETTER L # +0399 ; 006C ; MA # ( Ι → l ) GREEK CAPITAL LETTER IOTA → LATIN SMALL LETTER L # +1D6B0 ; 006C ; MA # ( 𝚰 → l ) MATHEMATICAL BOLD CAPITAL IOTA → LATIN SMALL LETTER L # →Ι→ +1D6EA ; 006C ; MA # ( 𝛪 → l ) MATHEMATICAL ITALIC CAPITAL IOTA → LATIN SMALL LETTER L # →Ι→ +1D724 ; 006C ; MA # ( 𝜤 → l ) MATHEMATICAL BOLD ITALIC CAPITAL IOTA → LATIN SMALL LETTER L # →Ι→ +1D75E ; 006C ; MA # ( 𝝞 → l ) MATHEMATICAL SANS-SERIF BOLD CAPITAL IOTA → LATIN SMALL LETTER L # →Ι→ +1D798 ; 006C ; MA # ( 𝞘 → l ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL IOTA → LATIN SMALL LETTER L # →Ι→ +2C92 ; 006C ; MA # ( Ⲓ → l ) COPTIC CAPITAL LETTER IAUDA → LATIN SMALL LETTER L # →Ӏ→ +0406 ; 006C ; MA # ( І → l ) CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I → LATIN SMALL LETTER L # +04CF ; 006C ; MA # ( ӏ → l ) CYRILLIC SMALL LETTER PALOCHKA → LATIN SMALL LETTER L # →I→ +04C0 ; 006C ; MA # ( Ӏ → l ) CYRILLIC LETTER PALOCHKA → LATIN SMALL LETTER L # +05D5 ; 006C ; MA # ( ‎ו‎ → l ) HEBREW LETTER VAV → LATIN SMALL LETTER L # +05DF ; 006C ; MA # ( ‎ן‎ → l ) HEBREW LETTER FINAL NUN → LATIN SMALL LETTER L # +0627 ; 006C ; MA # ( ‎ا‎ → l ) ARABIC LETTER ALEF → LATIN SMALL LETTER L # →1→ +1EE00 ; 006C ; MA # ( ‎𞸀‎ → l ) ARABIC MATHEMATICAL ALEF → LATIN SMALL LETTER L # →‎ا‎→→1→ +1EE80 ; 006C ; MA # ( ‎𞺀‎ → l ) 
ARABIC MATHEMATICAL LOOPED ALEF → LATIN SMALL LETTER L # →‎ا‎→→1→ +FE8E ; 006C ; MA # ( ‎ﺎ‎ → l ) ARABIC LETTER ALEF FINAL FORM → LATIN SMALL LETTER L # →‎ا‎→→1→ +FE8D ; 006C ; MA # ( ‎ﺍ‎ → l ) ARABIC LETTER ALEF ISOLATED FORM → LATIN SMALL LETTER L # →‎ا‎→→1→ +07CA ; 006C ; MA # ( ‎ߊ‎ → l ) NKO LETTER A → LATIN SMALL LETTER L # →∣→→ǀ→ +2D4F ; 006C ; MA # ( ⵏ → l ) TIFINAGH LETTER YAN → LATIN SMALL LETTER L # →Ӏ→ +16C1 ; 006C ; MA # ( ᛁ → l ) RUNIC LETTER ISAZ IS ISS I → LATIN SMALL LETTER L # →I→ +A4F2 ; 006C ; MA # ( ꓲ → l ) LISU LETTER I → LATIN SMALL LETTER L # →I→ +16F28 ; 006C ; MA # ( 𖼨 → l ) MIAO LETTER GHA → LATIN SMALL LETTER L # →I→ +1028A ; 006C ; MA # ( 𐊊 → l ) LYCIAN LETTER J → LATIN SMALL LETTER L # →I→ +10309 ; 006C ; MA # ( 𐌉 → l ) OLD ITALIC LETTER I → LATIN SMALL LETTER L # →I→ +11DDA ; 006C ; MA # ( 𑷚 → l ) TOLONG SIKI SIGN HECAKA → LATIN SMALL LETTER L # →|→ +11DE1 ; 006C ; MA # ( 𑷡 → l ) TOLONG SIKI DIGIT ONE → LATIN SMALL LETTER L # →|→ +16EAA ; 006C ; MA # ( 𖺪 → l ) BERIA ERFE CAPITAL LETTER LAKKO → LATIN SMALL LETTER L # →I→ + +1D22A ; 004C ; MA #* ( 𝈪 → L ) GREEK INSTRUMENTAL NOTATION SYMBOL-23 → LATIN CAPITAL LETTER L # +216C ; 004C ; MA # ( Ⅼ → L ) ROMAN NUMERAL FIFTY → LATIN CAPITAL LETTER L # +2112 ; 004C ; MA # ( ℒ → L ) SCRIPT CAPITAL L → LATIN CAPITAL LETTER L # +1CCE1 ; 004C ; MA #* ( 𜳡 → L ) OUTLINED LATIN CAPITAL LETTER L → LATIN CAPITAL LETTER L # +1D40B ; 004C ; MA # ( 𝐋 → L ) MATHEMATICAL BOLD CAPITAL L → LATIN CAPITAL LETTER L # +1D43F ; 004C ; MA # ( 𝐿 → L ) MATHEMATICAL ITALIC CAPITAL L → LATIN CAPITAL LETTER L # +1D473 ; 004C ; MA # ( 𝑳 → L ) MATHEMATICAL BOLD ITALIC CAPITAL L → LATIN CAPITAL LETTER L # +1D4DB ; 004C ; MA # ( 𝓛 → L ) MATHEMATICAL BOLD SCRIPT CAPITAL L → LATIN CAPITAL LETTER L # +1D50F ; 004C ; MA # ( 𝔏 → L ) MATHEMATICAL FRAKTUR CAPITAL L → LATIN CAPITAL LETTER L # +1D543 ; 004C ; MA # ( 𝕃 → L ) MATHEMATICAL DOUBLE-STRUCK CAPITAL L → LATIN CAPITAL LETTER L # +1D577 ; 004C ; MA # ( 𝕷 → L ) MATHEMATICAL BOLD 
FRAKTUR CAPITAL L → LATIN CAPITAL LETTER L # +1D5AB ; 004C ; MA # ( 𝖫 → L ) MATHEMATICAL SANS-SERIF CAPITAL L → LATIN CAPITAL LETTER L # +1D5DF ; 004C ; MA # ( 𝗟 → L ) MATHEMATICAL SANS-SERIF BOLD CAPITAL L → LATIN CAPITAL LETTER L # +1D613 ; 004C ; MA # ( 𝘓 → L ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL L → LATIN CAPITAL LETTER L # +1D647 ; 004C ; MA # ( 𝙇 → L ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL L → LATIN CAPITAL LETTER L # +1D67B ; 004C ; MA # ( 𝙻 → L ) MATHEMATICAL MONOSPACE CAPITAL L → LATIN CAPITAL LETTER L # +2CD0 ; 004C ; MA # ( Ⳑ → L ) COPTIC CAPITAL LETTER L-SHAPED HA → LATIN CAPITAL LETTER L # +13DE ; 004C ; MA # ( Ꮮ → L ) CHEROKEE LETTER TLE → LATIN CAPITAL LETTER L # +14AA ; 004C ; MA # ( ᒪ → L ) CANADIAN SYLLABICS MA → LATIN CAPITAL LETTER L # +A4E1 ; 004C ; MA # ( ꓡ → L ) LISU LETTER LA → LATIN CAPITAL LETTER L # +16F16 ; 004C ; MA # ( 𖼖 → L ) MIAO LETTER LA → LATIN CAPITAL LETTER L # +118A3 ; 004C ; MA # ( 𑢣 → L ) WARANG CITI CAPITAL LETTER YU → LATIN CAPITAL LETTER L # +118B2 ; 004C ; MA # ( 𑢲 → L ) WARANG CITI CAPITAL LETTER TTE → LATIN CAPITAL LETTER L # +1041B ; 004C ; MA # ( 𐐛 → L ) DESERET CAPITAL LETTER ETH → LATIN CAPITAL LETTER L # +10526 ; 004C ; MA # ( 𐔦 → L ) ELBASAN LETTER GHAMMA → LATIN CAPITAL LETTER L # + +FD3C ; 006C 030B ; MA # ( ‎ﴼ‎ → l̋ ) ARABIC LIGATURE ALEF WITH FATHATAN FINAL FORM → LATIN SMALL LETTER L, COMBINING DOUBLE ACUTE ACCENT # →‎اً‎→ +FD3D ; 006C 030B ; MA # ( ‎ﴽ‎ → l̋ ) ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FORM → LATIN SMALL LETTER L, COMBINING DOUBLE ACUTE ACCENT # →‎اً‎→ + +0142 ; 006C 0338 ; MA # ( ł → l̸ ) LATIN SMALL LETTER L WITH STROKE → LATIN SMALL LETTER L, COMBINING LONG SOLIDUS OVERLAY # →l̷→ + +0141 ; 004C 0338 ; MA # ( Ł → L̸ ) LATIN CAPITAL LETTER L WITH STROKE → LATIN CAPITAL LETTER L, COMBINING LONG SOLIDUS OVERLAY # →L̷→ + +026D ; 006C 0328 ; MA # ( ɭ → l̨ ) LATIN SMALL LETTER L WITH RETROFLEX HOOK → LATIN SMALL LETTER L, COMBINING OGONEK # →l̢→ + +0197 ; 006C 0335 ; MA # ( Ɨ → l̵ ) 
LATIN CAPITAL LETTER I WITH STROKE → LATIN SMALL LETTER L, COMBINING SHORT STROKE OVERLAY # →ƚ→ +019A ; 006C 0335 ; MA # ( ƚ → l̵ ) LATIN SMALL LETTER L WITH BAR → LATIN SMALL LETTER L, COMBINING SHORT STROKE OVERLAY # + +026B ; 006C 0334 ; MA # ( ɫ → l̴ ) LATIN SMALL LETTER L WITH MIDDLE TILDE → LATIN SMALL LETTER L, COMBINING TILDE OVERLAY # + +0625 ; 006C 0655 ; MA # ( ‎إ‎ → lٕ ) ARABIC LETTER ALEF WITH HAMZA BELOW → LATIN SMALL LETTER L, ARABIC HAMZA BELOW # →‎ٳ‎→→‎اٟ‎→ +FE88 ; 006C 0655 ; MA # ( ‎ﺈ‎ → lٕ ) ARABIC LETTER ALEF WITH HAMZA BELOW FINAL FORM → LATIN SMALL LETTER L, ARABIC HAMZA BELOW # →‎إ‎→→‎ٳ‎→→‎اٟ‎→ +FE87 ; 006C 0655 ; MA # ( ‎ﺇ‎ → lٕ ) ARABIC LETTER ALEF WITH HAMZA BELOW ISOLATED FORM → LATIN SMALL LETTER L, ARABIC HAMZA BELOW # →‎إ‎→→‎ٳ‎→→‎اٟ‎→ +0673 ; 006C 0655 ; MA # ( ‎ٳ‎ → lٕ ) ARABIC LETTER ALEF WITH WAVY HAMZA BELOW → LATIN SMALL LETTER L, ARABIC HAMZA BELOW # →‎اٟ‎→ + +0140 ; 006C 00B7 ; MA # ( ŀ → l· ) LATIN SMALL LETTER L WITH MIDDLE DOT → LATIN SMALL LETTER L, MIDDLE DOT # +013F ; 006C 00B7 ; MA # ( Ŀ → l· ) LATIN CAPITAL LETTER L WITH MIDDLE DOT → LATIN SMALL LETTER L, MIDDLE DOT # →L·→→ᒪ·→→ᒪᐧ→→ᒷ→→1ᐧ→ +14B7 ; 006C 00B7 ; MA # ( ᒷ → l· ) CANADIAN SYLLABICS WEST-CREE MWA → LATIN SMALL LETTER L, MIDDLE DOT # →1ᐧ→ + +1F102 ; 006C 002C ; MA #* ( 🄂 → l, ) DIGIT ONE COMMA → LATIN SMALL LETTER L, COMMA # →1,→ + +2488 ; 006C 002E ; MA #* ( ⒈ → l. ) DIGIT ONE FULL STOP → LATIN SMALL LETTER L, FULL STOP # →1.→ + +05F1 ; 006C 0027 ; MA # ( ‎ױ‎ → l' ) HEBREW LIGATURE YIDDISH VAV YOD → LATIN SMALL LETTER L, APOSTROPHE # →‎וי‎→ + +2493 ; 006C 0032 002E ; MA #* ( ⒓ → l2. 
) NUMBER TWELVE FULL STOP → LATIN SMALL LETTER L, DIGIT TWO, FULL STOP # →12.→ + +33EB ; 006C 0032 65E5 ; MA #* ( ㏫ → l2日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWELVE → LATIN SMALL LETTER L, DIGIT TWO, CJK UNIFIED IDEOGRAPH-65E5 # →12日→ + +32CB ; 006C 0032 6708 ; MA #* ( ㋋ → l2月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DECEMBER → LATIN SMALL LETTER L, DIGIT TWO, CJK UNIFIED IDEOGRAPH-6708 # →12月→ + +3364 ; 006C 0032 70B9 ; MA #* ( ㍤ → l2点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWELVE → LATIN SMALL LETTER L, DIGIT TWO, CJK UNIFIED IDEOGRAPH-70B9 # →12点→ + +2494 ; 006C 0033 002E ; MA #* ( ⒔ → l3. ) NUMBER THIRTEEN FULL STOP → LATIN SMALL LETTER L, DIGIT THREE, FULL STOP # →13.→ + +33EC ; 006C 0033 65E5 ; MA #* ( ㏬ → l3日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THIRTEEN → LATIN SMALL LETTER L, DIGIT THREE, CJK UNIFIED IDEOGRAPH-65E5 # →13日→ + +3365 ; 006C 0033 70B9 ; MA #* ( ㍥ → l3点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR THIRTEEN → LATIN SMALL LETTER L, DIGIT THREE, CJK UNIFIED IDEOGRAPH-70B9 # →13点→ + +2495 ; 006C 0034 002E ; MA #* ( ⒕ → l4. ) NUMBER FOURTEEN FULL STOP → LATIN SMALL LETTER L, DIGIT FOUR, FULL STOP # →14.→ + +33ED ; 006C 0034 65E5 ; MA #* ( ㏭ → l4日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FOURTEEN → LATIN SMALL LETTER L, DIGIT FOUR, CJK UNIFIED IDEOGRAPH-65E5 # →14日→ + +3366 ; 006C 0034 70B9 ; MA #* ( ㍦ → l4点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FOURTEEN → LATIN SMALL LETTER L, DIGIT FOUR, CJK UNIFIED IDEOGRAPH-70B9 # →14点→ + +2496 ; 006C 0035 002E ; MA #* ( ⒖ → l5. ) NUMBER FIFTEEN FULL STOP → LATIN SMALL LETTER L, DIGIT FIVE, FULL STOP # →15.→ + +33EE ; 006C 0035 65E5 ; MA #* ( ㏮ → l5日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FIFTEEN → LATIN SMALL LETTER L, DIGIT FIVE, CJK UNIFIED IDEOGRAPH-65E5 # →15日→ + +3367 ; 006C 0035 70B9 ; MA #* ( ㍧ → l5点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FIFTEEN → LATIN SMALL LETTER L, DIGIT FIVE, CJK UNIFIED IDEOGRAPH-70B9 # →15点→ + +2497 ; 006C 0036 002E ; MA #* ( ⒗ → l6. 
) NUMBER SIXTEEN FULL STOP → LATIN SMALL LETTER L, DIGIT SIX, FULL STOP # →16.→ + +33EF ; 006C 0036 65E5 ; MA #* ( ㏯ → l6日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SIXTEEN → LATIN SMALL LETTER L, DIGIT SIX, CJK UNIFIED IDEOGRAPH-65E5 # →16日→ + +3368 ; 006C 0036 70B9 ; MA #* ( ㍨ → l6点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SIXTEEN → LATIN SMALL LETTER L, DIGIT SIX, CJK UNIFIED IDEOGRAPH-70B9 # →16点→ + +2498 ; 006C 0037 002E ; MA #* ( ⒘ → l7. ) NUMBER SEVENTEEN FULL STOP → LATIN SMALL LETTER L, DIGIT SEVEN, FULL STOP # →17.→ + +33F0 ; 006C 0037 65E5 ; MA #* ( ㏰ → l7日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SEVENTEEN → LATIN SMALL LETTER L, DIGIT SEVEN, CJK UNIFIED IDEOGRAPH-65E5 # →17日→ + +3369 ; 006C 0037 70B9 ; MA #* ( ㍩ → l7点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SEVENTEEN → LATIN SMALL LETTER L, DIGIT SEVEN, CJK UNIFIED IDEOGRAPH-70B9 # →17点→ + +2499 ; 006C 0038 002E ; MA #* ( ⒙ → l8. ) NUMBER EIGHTEEN FULL STOP → LATIN SMALL LETTER L, DIGIT EIGHT, FULL STOP # →18.→ + +33F1 ; 006C 0038 65E5 ; MA #* ( ㏱ → l8日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY EIGHTEEN → LATIN SMALL LETTER L, DIGIT EIGHT, CJK UNIFIED IDEOGRAPH-65E5 # →18日→ + +336A ; 006C 0038 70B9 ; MA #* ( ㍪ → l8点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR EIGHTEEN → LATIN SMALL LETTER L, DIGIT EIGHT, CJK UNIFIED IDEOGRAPH-70B9 # →18点→ + +249A ; 006C 0039 002E ; MA #* ( ⒚ → l9. 
) NUMBER NINETEEN FULL STOP → LATIN SMALL LETTER L, DIGIT NINE, FULL STOP # →19.→ + +33F2 ; 006C 0039 65E5 ; MA #* ( ㏲ → l9日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY NINETEEN → LATIN SMALL LETTER L, DIGIT NINE, CJK UNIFIED IDEOGRAPH-65E5 # →19日→ + +336B ; 006C 0039 70B9 ; MA #* ( ㍫ → l9点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR NINETEEN → LATIN SMALL LETTER L, DIGIT NINE, CJK UNIFIED IDEOGRAPH-70B9 # →19点→ + +01C9 ; 006C 006A ; MA # ( lj → lj ) LATIN SMALL LETTER LJ → LATIN SMALL LETTER L, LATIN SMALL LETTER J # + +0132 ; 006C 004A ; MA # ( IJ → lJ ) LATIN CAPITAL LIGATURE IJ → LATIN SMALL LETTER L, LATIN CAPITAL LETTER J # →IJ→ + +01C8 ; 004C 006A ; MA # ( Lj → Lj ) LATIN CAPITAL LETTER L WITH SMALL LETTER J → LATIN CAPITAL LETTER L, LATIN SMALL LETTER J # + +01C7 ; 004C 004A ; MA # ( LJ → LJ ) LATIN CAPITAL LETTER LJ → LATIN CAPITAL LETTER L, LATIN CAPITAL LETTER J # + +2016 ; 006C 006C ; MA #* ( ‖ → ll ) DOUBLE VERTICAL LINE → LATIN SMALL LETTER L, LATIN SMALL LETTER L # →∥→→||→ +2225 ; 006C 006C ; MA #* ( ∥ → ll ) PARALLEL TO → LATIN SMALL LETTER L, LATIN SMALL LETTER L # →||→ +2161 ; 006C 006C ; MA # ( Ⅱ → ll ) ROMAN NUMERAL TWO → LATIN SMALL LETTER L, LATIN SMALL LETTER L # →II→ +01C1 ; 006C 006C ; MA # ( ǁ → ll ) LATIN LETTER LATERAL CLICK → LATIN SMALL LETTER L, LATIN SMALL LETTER L # →‖→→∥→→||→ +05F0 ; 006C 006C ; MA # ( ‎װ‎ → ll ) HEBREW LIGATURE YIDDISH DOUBLE VAV → LATIN SMALL LETTER L, LATIN SMALL LETTER L # →‎וו‎→ + +10199 ; 006C 0335 006C 0335 ; MA #* ( 𐆙 → l̵l̵ ) ROMAN DUPONDIUS SIGN → LATIN SMALL LETTER L, COMBINING SHORT STROKE OVERLAY, LATIN SMALL LETTER L, COMBINING SHORT STROKE OVERLAY # →I̶I̶→ + +2492 ; 006C 006C 002E ; MA #* ( ⒒ → ll. 
) NUMBER ELEVEN FULL STOP → LATIN SMALL LETTER L, LATIN SMALL LETTER L, FULL STOP # →11.→ + +2162 ; 006C 006C 006C ; MA # ( Ⅲ → lll ) ROMAN NUMERAL THREE → LATIN SMALL LETTER L, LATIN SMALL LETTER L, LATIN SMALL LETTER L # →III→ + +10198 ; 006C 0335 006C 0335 0053 0335 ; MA #* ( 𐆘 → l̵l̵S̵ ) ROMAN SESTERTIUS SIGN → LATIN SMALL LETTER L, COMBINING SHORT STROKE OVERLAY, LATIN SMALL LETTER L, COMBINING SHORT STROKE OVERLAY, LATIN CAPITAL LETTER S, COMBINING SHORT STROKE OVERLAY # →I̶I̶S̶→ + +33EA ; 006C 006C 65E5 ; MA #* ( ㏪ → ll日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY ELEVEN → LATIN SMALL LETTER L, LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-65E5 # →11日→ + +32CA ; 006C 006C 6708 ; MA #* ( ㋊ → ll月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR NOVEMBER → LATIN SMALL LETTER L, LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-6708 # →11月→ + +3363 ; 006C 006C 70B9 ; MA #* ( ㍣ → ll点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR ELEVEN → LATIN SMALL LETTER L, LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-70B9 # →11点→ + +042E ; 006C 004F ; MA # ( Ю → lO ) CYRILLIC CAPITAL LETTER YU → LATIN SMALL LETTER L, LATIN CAPITAL LETTER O # →IO→ + +2491 ; 006C 004F 002E ; MA #* ( ⒑ → lO. 
) NUMBER TEN FULL STOP → LATIN SMALL LETTER L, LATIN CAPITAL LETTER O, FULL STOP # →10.→ + +33E9 ; 006C 004F 65E5 ; MA #* ( ㏩ → lO日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TEN → LATIN SMALL LETTER L, LATIN CAPITAL LETTER O, CJK UNIFIED IDEOGRAPH-65E5 # →10日→ + +32C9 ; 006C 004F 6708 ; MA #* ( ㋉ → lO月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR OCTOBER → LATIN SMALL LETTER L, LATIN CAPITAL LETTER O, CJK UNIFIED IDEOGRAPH-6708 # →10月→ + +3362 ; 006C 004F 70B9 ; MA #* ( ㍢ → lO点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TEN → LATIN SMALL LETTER L, LATIN CAPITAL LETTER O, CJK UNIFIED IDEOGRAPH-70B9 # →10点→ + +02AA ; 006C 0073 ; MA # ( ʪ → ls ) LATIN SMALL LETTER LS DIGRAPH → LATIN SMALL LETTER L, LATIN SMALL LETTER S # + +20B6 ; 006C 0074 ; MA #* ( ₶ → lt ) LIVRE TOURNOIS SIGN → LATIN SMALL LETTER L, LATIN SMALL LETTER T # + +2163 ; 006C 0056 ; MA # ( Ⅳ → lV ) ROMAN NUMERAL FOUR → LATIN SMALL LETTER L, LATIN CAPITAL LETTER V # →IV→ + +2168 ; 006C 0058 ; MA # ( Ⅸ → lX ) ROMAN NUMERAL NINE → LATIN SMALL LETTER L, LATIN CAPITAL LETTER X # →IX→ + +026E ; 006C 021D ; MA # ( ɮ → lȝ ) LATIN SMALL LETTER LEZH → LATIN SMALL LETTER L, LATIN SMALL LETTER YOGH # →lʒ→ + +02AB ; 006C 007A ; MA # ( ʫ → lz ) LATIN SMALL LETTER LZ DIGRAPH → LATIN SMALL LETTER L, LATIN SMALL LETTER Z # + +0675 ; 006C 0674 ; MA # ( ‎ٵ‎ → ‎lٴ‎ ) ARABIC LETTER HIGH HAMZA ALEF → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA # →‎اٴ‎→ +0623 ; 006C 0674 ; MA # ( ‎أ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH HAMZA ABOVE → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA # →‎ٵ‎→→‎اٴ‎→ +FE84 ; 006C 0674 ; MA # ( ‎ﺄ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH HAMZA ABOVE FINAL FORM → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA # →‎أ‎→→‎ٵ‎→→‎اٴ‎→ +FE83 ; 006C 0674 ; MA # ( ‎ﺃ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH HAMZA ABOVE ISOLATED FORM → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA # →‎ٵ‎→→‎اٴ‎→ +0672 ; 006C 0674 ; MA # ( ‎ٲ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH WAVY HAMZA ABOVE → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA # →‎أ‎→→‎ٵ‎→→‎اٴ‎→ + 
+FDF3 ; 006C 0643 0628 0631 ; MA # ( ‎ﷳ‎ → ‎lكبر‎ ) ARABIC LIGATURE AKBAR ISOLATED FORM → LATIN SMALL LETTER L, ARABIC LETTER KAF, ARABIC LETTER BEH, ARABIC LETTER REH # →‎اكبر‎→ + +FDF2 ; 006C 0644 0644 0651 0670 006F ; MA # ( ‎ﷲ‎ → ‎lللّٰo‎ ) ARABIC LIGATURE ALLAH ISOLATED FORM → LATIN SMALL LETTER L, ARABIC LETTER LAM, ARABIC LETTER LAM, ARABIC SHADDA, ARABIC LETTER SUPERSCRIPT ALEF, LATIN SMALL LETTER O # →‎اللّٰه‎→ + +33E0 ; 006C 65E5 ; MA #* ( ㏠ → l日 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY ONE → LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-65E5 # →1日→ + +32C0 ; 006C 6708 ; MA #* ( ㋀ → l月 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR JANUARY → LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-6708 # →1月→ + +3359 ; 006C 70B9 ; MA #* ( ㍙ → l点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR ONE → LATIN SMALL LETTER L, CJK UNIFIED IDEOGRAPH-70B9 # →1点→ + +2CD1 ; 029F ; MA # ( ⳑ → ʟ ) COPTIC SMALL LETTER L-SHAPED HA → LATIN LETTER SMALL CAPITAL L # +ABAE ; 029F ; MA # ( ꮮ → ʟ ) CHEROKEE SMALL LETTER TLE → LATIN LETTER SMALL CAPITAL L # +10443 ; 029F ; MA # ( 𐑃 → ʟ ) DESERET SMALL LETTER ETH → LATIN LETTER SMALL CAPITAL L # + +FF2D ; 004D ; MA # ( M → M ) FULLWIDTH LATIN CAPITAL LETTER M → LATIN CAPITAL LETTER M # →Μ→ +216F ; 004D ; MA # ( Ⅿ → M ) ROMAN NUMERAL ONE THOUSAND → LATIN CAPITAL LETTER M # +2133 ; 004D ; MA # ( ℳ → M ) SCRIPT CAPITAL M → LATIN CAPITAL LETTER M # +1CCE2 ; 004D ; MA #* ( 𜳢 → M ) OUTLINED LATIN CAPITAL LETTER M → LATIN CAPITAL LETTER M # +1D40C ; 004D ; MA # ( 𝐌 → M ) MATHEMATICAL BOLD CAPITAL M → LATIN CAPITAL LETTER M # +1D440 ; 004D ; MA # ( 𝑀 → M ) MATHEMATICAL ITALIC CAPITAL M → LATIN CAPITAL LETTER M # +1D474 ; 004D ; MA # ( 𝑴 → M ) MATHEMATICAL BOLD ITALIC CAPITAL M → LATIN CAPITAL LETTER M # +1D4DC ; 004D ; MA # ( 𝓜 → M ) MATHEMATICAL BOLD SCRIPT CAPITAL M → LATIN CAPITAL LETTER M # +1D510 ; 004D ; MA # ( 𝔐 → M ) MATHEMATICAL FRAKTUR CAPITAL M → LATIN CAPITAL LETTER M # +1D544 ; 004D ; MA # ( 𝕄 → M ) MATHEMATICAL DOUBLE-STRUCK CAPITAL M → LATIN CAPITAL LETTER 
M # +1D578 ; 004D ; MA # ( 𝕸 → M ) MATHEMATICAL BOLD FRAKTUR CAPITAL M → LATIN CAPITAL LETTER M # +1D5AC ; 004D ; MA # ( 𝖬 → M ) MATHEMATICAL SANS-SERIF CAPITAL M → LATIN CAPITAL LETTER M # +1D5E0 ; 004D ; MA # ( 𝗠 → M ) MATHEMATICAL SANS-SERIF BOLD CAPITAL M → LATIN CAPITAL LETTER M # +1D614 ; 004D ; MA # ( 𝘔 → M ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL M → LATIN CAPITAL LETTER M # +1D648 ; 004D ; MA # ( 𝙈 → M ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL M → LATIN CAPITAL LETTER M # +1D67C ; 004D ; MA # ( 𝙼 → M ) MATHEMATICAL MONOSPACE CAPITAL M → LATIN CAPITAL LETTER M # +039C ; 004D ; MA # ( Μ → M ) GREEK CAPITAL LETTER MU → LATIN CAPITAL LETTER M # +1D6B3 ; 004D ; MA # ( 𝚳 → M ) MATHEMATICAL BOLD CAPITAL MU → LATIN CAPITAL LETTER M # →𝐌→ +1D6ED ; 004D ; MA # ( 𝛭 → M ) MATHEMATICAL ITALIC CAPITAL MU → LATIN CAPITAL LETTER M # →𝑀→ +1D727 ; 004D ; MA # ( 𝜧 → M ) MATHEMATICAL BOLD ITALIC CAPITAL MU → LATIN CAPITAL LETTER M # →𝑴→ +1D761 ; 004D ; MA # ( 𝝡 → M ) MATHEMATICAL SANS-SERIF BOLD CAPITAL MU → LATIN CAPITAL LETTER M # →Μ→ +1D79B ; 004D ; MA # ( 𝞛 → M ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL MU → LATIN CAPITAL LETTER M # →Μ→ +03FA ; 004D ; MA # ( Ϻ → M ) GREEK CAPITAL LETTER SAN → LATIN CAPITAL LETTER M # +2C98 ; 004D ; MA # ( Ⲙ → M ) COPTIC CAPITAL LETTER MI → LATIN CAPITAL LETTER M # +041C ; 004D ; MA # ( М → M ) CYRILLIC CAPITAL LETTER EM → LATIN CAPITAL LETTER M # +13B7 ; 004D ; MA # ( Ꮇ → M ) CHEROKEE LETTER LU → LATIN CAPITAL LETTER M # +15F0 ; 004D ; MA # ( ᗰ → M ) CANADIAN SYLLABICS CARRIER GO → LATIN CAPITAL LETTER M # +16D6 ; 004D ; MA # ( ᛖ → M ) RUNIC LETTER EHWAZ EH E → LATIN CAPITAL LETTER M # +A4DF ; 004D ; MA # ( ꓟ → M ) LISU LETTER MA → LATIN CAPITAL LETTER M # +102B0 ; 004D ; MA # ( 𐊰 → M ) CARIAN LETTER S → LATIN CAPITAL LETTER M # +10311 ; 004D ; MA # ( 𐌑 → M ) OLD ITALIC LETTER SHE → LATIN CAPITAL LETTER M # + +04CD ; 004D 0326 ; MA # ( Ӎ → M̦ ) CYRILLIC CAPITAL LETTER EM WITH TAIL → LATIN CAPITAL LETTER M, COMBINING COMMA BELOW # 
→М̡→ + +1F76B ; 004D 0042 ; MA #* ( 🝫 → MB ) ALCHEMICAL SYMBOL FOR BATH OF MARY → LATIN CAPITAL LETTER M, LATIN CAPITAL LETTER B # + +2DE8 ; 1DDF ; MA # ( ⷨ → ᷟ ) COMBINING CYRILLIC LETTER EM → COMBINING LATIN LETTER SMALL CAPITAL M # + +1D427 ; 006E ; MA # ( 𝐧 → n ) MATHEMATICAL BOLD SMALL N → LATIN SMALL LETTER N # +1D45B ; 006E ; MA # ( 𝑛 → n ) MATHEMATICAL ITALIC SMALL N → LATIN SMALL LETTER N # +1D48F ; 006E ; MA # ( 𝒏 → n ) MATHEMATICAL BOLD ITALIC SMALL N → LATIN SMALL LETTER N # +1D4C3 ; 006E ; MA # ( 𝓃 → n ) MATHEMATICAL SCRIPT SMALL N → LATIN SMALL LETTER N # +1D4F7 ; 006E ; MA # ( 𝓷 → n ) MATHEMATICAL BOLD SCRIPT SMALL N → LATIN SMALL LETTER N # +1D52B ; 006E ; MA # ( 𝔫 → n ) MATHEMATICAL FRAKTUR SMALL N → LATIN SMALL LETTER N # +1D55F ; 006E ; MA # ( 𝕟 → n ) MATHEMATICAL DOUBLE-STRUCK SMALL N → LATIN SMALL LETTER N # +1D593 ; 006E ; MA # ( 𝖓 → n ) MATHEMATICAL BOLD FRAKTUR SMALL N → LATIN SMALL LETTER N # +1D5C7 ; 006E ; MA # ( 𝗇 → n ) MATHEMATICAL SANS-SERIF SMALL N → LATIN SMALL LETTER N # +1D5FB ; 006E ; MA # ( 𝗻 → n ) MATHEMATICAL SANS-SERIF BOLD SMALL N → LATIN SMALL LETTER N # +1D62F ; 006E ; MA # ( 𝘯 → n ) MATHEMATICAL SANS-SERIF ITALIC SMALL N → LATIN SMALL LETTER N # +1D663 ; 006E ; MA # ( 𝙣 → n ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL N → LATIN SMALL LETTER N # +1D697 ; 006E ; MA # ( 𝚗 → n ) MATHEMATICAL MONOSPACE SMALL N → LATIN SMALL LETTER N # +0578 ; 006E ; MA # ( ո → n ) ARMENIAN SMALL LETTER VO → LATIN SMALL LETTER N # +057C ; 006E ; MA # ( ռ → n ) ARMENIAN SMALL LETTER RA → LATIN SMALL LETTER N # + +FF2E ; 004E ; MA # ( N → N ) FULLWIDTH LATIN CAPITAL LETTER N → LATIN CAPITAL LETTER N # →Ν→ +2115 ; 004E ; MA # ( ℕ → N ) DOUBLE-STRUCK CAPITAL N → LATIN CAPITAL LETTER N # +1CCE3 ; 004E ; MA #* ( 𜳣 → N ) OUTLINED LATIN CAPITAL LETTER N → LATIN CAPITAL LETTER N # +1D40D ; 004E ; MA # ( 𝐍 → N ) MATHEMATICAL BOLD CAPITAL N → LATIN CAPITAL LETTER N # +1D441 ; 004E ; MA # ( 𝑁 → N ) MATHEMATICAL ITALIC CAPITAL N → LATIN CAPITAL LETTER N # 
+1D475 ; 004E ; MA # ( 𝑵 → N ) MATHEMATICAL BOLD ITALIC CAPITAL N → LATIN CAPITAL LETTER N # +1D4A9 ; 004E ; MA # ( 𝒩 → N ) MATHEMATICAL SCRIPT CAPITAL N → LATIN CAPITAL LETTER N # +1D4DD ; 004E ; MA # ( 𝓝 → N ) MATHEMATICAL BOLD SCRIPT CAPITAL N → LATIN CAPITAL LETTER N # +1D511 ; 004E ; MA # ( 𝔑 → N ) MATHEMATICAL FRAKTUR CAPITAL N → LATIN CAPITAL LETTER N # +1D579 ; 004E ; MA # ( 𝕹 → N ) MATHEMATICAL BOLD FRAKTUR CAPITAL N → LATIN CAPITAL LETTER N # +1D5AD ; 004E ; MA # ( 𝖭 → N ) MATHEMATICAL SANS-SERIF CAPITAL N → LATIN CAPITAL LETTER N # +1D5E1 ; 004E ; MA # ( 𝗡 → N ) MATHEMATICAL SANS-SERIF BOLD CAPITAL N → LATIN CAPITAL LETTER N # +1D615 ; 004E ; MA # ( 𝘕 → N ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL N → LATIN CAPITAL LETTER N # +1D649 ; 004E ; MA # ( 𝙉 → N ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL N → LATIN CAPITAL LETTER N # +1D67D ; 004E ; MA # ( 𝙽 → N ) MATHEMATICAL MONOSPACE CAPITAL N → LATIN CAPITAL LETTER N # +039D ; 004E ; MA # ( Ν → N ) GREEK CAPITAL LETTER NU → LATIN CAPITAL LETTER N # +1D6B4 ; 004E ; MA # ( 𝚴 → N ) MATHEMATICAL BOLD CAPITAL NU → LATIN CAPITAL LETTER N # →𝐍→ +1D6EE ; 004E ; MA # ( 𝛮 → N ) MATHEMATICAL ITALIC CAPITAL NU → LATIN CAPITAL LETTER N # →𝑁→ +1D728 ; 004E ; MA # ( 𝜨 → N ) MATHEMATICAL BOLD ITALIC CAPITAL NU → LATIN CAPITAL LETTER N # →𝑵→ +1D762 ; 004E ; MA # ( 𝝢 → N ) MATHEMATICAL SANS-SERIF BOLD CAPITAL NU → LATIN CAPITAL LETTER N # →Ν→ +1D79C ; 004E ; MA # ( 𝞜 → N ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL NU → LATIN CAPITAL LETTER N # →Ν→ +2C9A ; 004E ; MA # ( Ⲛ → N ) COPTIC CAPITAL LETTER NI → LATIN CAPITAL LETTER N # +A4E0 ; 004E ; MA # ( ꓠ → N ) LISU LETTER NA → LATIN CAPITAL LETTER N # +10513 ; 004E ; MA # ( 𐔓 → N ) ELBASAN LETTER NE → LATIN CAPITAL LETTER N # + +1018E ; 004E 030A ; MA #* ( 𐆎 → N̊ ) NOMISMA SIGN → LATIN CAPITAL LETTER N, COMBINING RING ABOVE # →Νͦ→ + +0273 ; 006E 0328 ; MA # ( ɳ → n̨ ) LATIN SMALL LETTER N WITH RETROFLEX HOOK → LATIN SMALL LETTER N, COMBINING OGONEK # →n̢→ + +019E ; 006E 0329 ; 
MA # ( ƞ → n̩ ) LATIN SMALL LETTER N WITH LONG RIGHT LEG → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # +014B ; 006E 0329 ; MA # ( ŋ → n̩ ) LATIN SMALL LETTER ENG → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →η→→ƞ→ +03B7 ; 006E 0329 ; MA # ( η → n̩ ) GREEK SMALL LETTER ETA → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →ƞ→ +1D6C8 ; 006E 0329 ; MA # ( 𝛈 → n̩ ) MATHEMATICAL BOLD SMALL ETA → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →η→→ƞ→ +1D702 ; 006E 0329 ; MA # ( 𝜂 → n̩ ) MATHEMATICAL ITALIC SMALL ETA → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →η→→ƞ→ +1D73C ; 006E 0329 ; MA # ( 𝜼 → n̩ ) MATHEMATICAL BOLD ITALIC SMALL ETA → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →η→→ƞ→ +1D776 ; 006E 0329 ; MA # ( 𝝶 → n̩ ) MATHEMATICAL SANS-SERIF BOLD SMALL ETA → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →η→→ƞ→ +1D7B0 ; 006E 0329 ; MA # ( 𝞰 → n̩ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ETA → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →η→→ƞ→ +0572 ; 006E 0329 ; MA # ( ղ → n̩ ) ARMENIAN SMALL LETTER GHAD → LATIN SMALL LETTER N, COMBINING VERTICAL LINE BELOW # →η→→ƞ→ + +019D ; 004E 0326 ; MA # ( Ɲ → N̦ ) LATIN CAPITAL LETTER N WITH LEFT HOOK → LATIN CAPITAL LETTER N, COMBINING COMMA BELOW # →N̡→ + +1D70 ; 006E 0334 ; MA # ( ᵰ → n̴ ) LATIN SMALL LETTER N WITH MIDDLE TILDE → LATIN SMALL LETTER N, COMBINING TILDE OVERLAY # + +01CC ; 006E 006A ; MA # ( nj → nj ) LATIN SMALL LETTER NJ → LATIN SMALL LETTER N, LATIN SMALL LETTER J # + +01CB ; 004E 006A ; MA # ( Nj → Nj ) LATIN CAPITAL LETTER N WITH SMALL LETTER J → LATIN CAPITAL LETTER N, LATIN SMALL LETTER J # + +01CA ; 004E 004A ; MA # ( NJ → NJ ) LATIN CAPITAL LETTER NJ → LATIN CAPITAL LETTER N, LATIN CAPITAL LETTER J # + +2116 ; 004E 006F ; MA #* ( № → No ) NUMERO SIGN → LATIN CAPITAL LETTER N, LATIN SMALL LETTER O # + +2C9B ; 0274 ; MA # ( ⲛ → ɴ ) COPTIC SMALL LETTER NI → LATIN LETTER SMALL CAPITAL N # + +0377 ; 1D0E ; MA # ( ͷ → ᴎ ) GREEK 
SMALL LETTER PAMPHYLIAN DIGAMMA → LATIN LETTER SMALL CAPITAL REVERSED N # →и→ +0438 ; 1D0E ; MA # ( и → ᴎ ) CYRILLIC SMALL LETTER I → LATIN LETTER SMALL CAPITAL REVERSED N # +1044D ; 1D0E ; MA # ( 𐑍 → ᴎ ) DESERET SMALL LETTER ENG → LATIN LETTER SMALL CAPITAL REVERSED N # →и→ + +0146 ; 0272 ; MA # ( ņ → ɲ ) LATIN SMALL LETTER N WITH CEDILLA → LATIN SMALL LETTER N WITH LEFT HOOK # + +0C02 ; 006F ; MA # ( ం → o ) TELUGU SIGN ANUSVARA → LATIN SMALL LETTER O # +0C82 ; 006F ; MA # ( ಂ → o ) KANNADA SIGN ANUSVARA → LATIN SMALL LETTER O # +0D02 ; 006F ; MA # ( ം → o ) MALAYALAM SIGN ANUSVARA → LATIN SMALL LETTER O # +0D82 ; 006F ; MA # ( ං → o ) SINHALA SIGN ANUSVARAYA → LATIN SMALL LETTER O # +0966 ; 006F ; MA # ( ० → o ) DEVANAGARI DIGIT ZERO → LATIN SMALL LETTER O # +09E6 ; 006F ; MA # ( ০ → o ) BENGALI DIGIT ZERO → LATIN SMALL LETTER O # +0A66 ; 006F ; MA # ( ੦ → o ) GURMUKHI DIGIT ZERO → LATIN SMALL LETTER O # +0AE6 ; 006F ; MA # ( ૦ → o ) GUJARATI DIGIT ZERO → LATIN SMALL LETTER O # +0B66 ; 006F ; MA # ( ୦ → o ) ORIYA DIGIT ZERO → LATIN SMALL LETTER O # +0BE6 ; 006F ; MA # ( ௦ → o ) TAMIL DIGIT ZERO → LATIN SMALL LETTER O # +0C66 ; 006F ; MA # ( ౦ → o ) TELUGU DIGIT ZERO → LATIN SMALL LETTER O # +0D66 ; 006F ; MA # ( ൦ → o ) MALAYALAM DIGIT ZERO → LATIN SMALL LETTER O # +0E50 ; 006F ; MA # ( ๐ → o ) THAI DIGIT ZERO → LATIN SMALL LETTER O # +0ED0 ; 006F ; MA # ( ໐ → o ) LAO DIGIT ZERO → LATIN SMALL LETTER O # +1040 ; 006F ; MA # ( ၀ → o ) MYANMAR DIGIT ZERO → LATIN SMALL LETTER O # +17E0 ; 006F ; MA # ( ០ → o ) KHMER DIGIT ZERO → LATIN SMALL LETTER O # +114D0 ; 006F ; MA # ( 𑓐 → o ) TIRHUTA DIGIT ZERO → LATIN SMALL LETTER O # →০→ +0665 ; 006F ; MA # ( ‎٥‎ → o ) ARABIC-INDIC DIGIT FIVE → LATIN SMALL LETTER O # +06F5 ; 006F ; MA # ( ۵ → o ) EXTENDED ARABIC-INDIC DIGIT FIVE → LATIN SMALL LETTER O # →‎٥‎→ +FF4F ; 006F ; MA # ( o → o ) FULLWIDTH LATIN SMALL LETTER O → LATIN SMALL LETTER O # →о→ +2134 ; 006F ; MA # ( ℴ → o ) SCRIPT SMALL O → LATIN SMALL LETTER O # +1D428 ; 
006F ; MA # ( 𝐨 → o ) MATHEMATICAL BOLD SMALL O → LATIN SMALL LETTER O # +1D45C ; 006F ; MA # ( 𝑜 → o ) MATHEMATICAL ITALIC SMALL O → LATIN SMALL LETTER O # +1D490 ; 006F ; MA # ( 𝒐 → o ) MATHEMATICAL BOLD ITALIC SMALL O → LATIN SMALL LETTER O # +1D4F8 ; 006F ; MA # ( 𝓸 → o ) MATHEMATICAL BOLD SCRIPT SMALL O → LATIN SMALL LETTER O # +1D52C ; 006F ; MA # ( 𝔬 → o ) MATHEMATICAL FRAKTUR SMALL O → LATIN SMALL LETTER O # +1D560 ; 006F ; MA # ( 𝕠 → o ) MATHEMATICAL DOUBLE-STRUCK SMALL O → LATIN SMALL LETTER O # +1D594 ; 006F ; MA # ( 𝖔 → o ) MATHEMATICAL BOLD FRAKTUR SMALL O → LATIN SMALL LETTER O # +1D5C8 ; 006F ; MA # ( 𝗈 → o ) MATHEMATICAL SANS-SERIF SMALL O → LATIN SMALL LETTER O # +1D5FC ; 006F ; MA # ( 𝗼 → o ) MATHEMATICAL SANS-SERIF BOLD SMALL O → LATIN SMALL LETTER O # +1D630 ; 006F ; MA # ( 𝘰 → o ) MATHEMATICAL SANS-SERIF ITALIC SMALL O → LATIN SMALL LETTER O # +1D664 ; 006F ; MA # ( 𝙤 → o ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL O → LATIN SMALL LETTER O # +1D698 ; 006F ; MA # ( 𝚘 → o ) MATHEMATICAL MONOSPACE SMALL O → LATIN SMALL LETTER O # +1D0F ; 006F ; MA # ( ᴏ → o ) LATIN LETTER SMALL CAPITAL O → LATIN SMALL LETTER O # +1D11 ; 006F ; MA # ( ᴑ → o ) LATIN SMALL LETTER SIDEWAYS O → LATIN SMALL LETTER O # +AB3D ; 006F ; MA # ( ꬽ → o ) LATIN SMALL LETTER BLACKLETTER O → LATIN SMALL LETTER O # +03BF ; 006F ; MA # ( ο → o ) GREEK SMALL LETTER OMICRON → LATIN SMALL LETTER O # +1D6D0 ; 006F ; MA # ( 𝛐 → o ) MATHEMATICAL BOLD SMALL OMICRON → LATIN SMALL LETTER O # →𝐨→ +1D70A ; 006F ; MA # ( 𝜊 → o ) MATHEMATICAL ITALIC SMALL OMICRON → LATIN SMALL LETTER O # →𝑜→ +1D744 ; 006F ; MA # ( 𝝄 → o ) MATHEMATICAL BOLD ITALIC SMALL OMICRON → LATIN SMALL LETTER O # →𝒐→ +1D77E ; 006F ; MA # ( 𝝾 → o ) MATHEMATICAL SANS-SERIF BOLD SMALL OMICRON → LATIN SMALL LETTER O # →ο→ +1D7B8 ; 006F ; MA # ( 𝞸 → o ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL OMICRON → LATIN SMALL LETTER O # →ο→ +03C3 ; 006F ; MA # ( σ → o ) GREEK SMALL LETTER SIGMA → LATIN SMALL LETTER O # +1D6D4 ; 006F ; MA 
# ( 𝛔 → o ) MATHEMATICAL BOLD SMALL SIGMA → LATIN SMALL LETTER O # →σ→ +1D70E ; 006F ; MA # ( 𝜎 → o ) MATHEMATICAL ITALIC SMALL SIGMA → LATIN SMALL LETTER O # →σ→ +1D748 ; 006F ; MA # ( 𝝈 → o ) MATHEMATICAL BOLD ITALIC SMALL SIGMA → LATIN SMALL LETTER O # →σ→ +1D782 ; 006F ; MA # ( 𝞂 → o ) MATHEMATICAL SANS-SERIF BOLD SMALL SIGMA → LATIN SMALL LETTER O # →σ→ +1D7BC ; 006F ; MA # ( 𝞼 → o ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL SIGMA → LATIN SMALL LETTER O # →σ→ +2C9F ; 006F ; MA # ( ⲟ → o ) COPTIC SMALL LETTER O → LATIN SMALL LETTER O # +03ED ; 006F ; MA # ( ϭ → o ) COPTIC SMALL LETTER SHIMA → LATIN SMALL LETTER O # →σ→ +043E ; 006F ; MA # ( о → o ) CYRILLIC SMALL LETTER O → LATIN SMALL LETTER O # +10FF ; 006F ; MA # ( ჿ → o ) GEORGIAN LETTER LABIAL SIGN → LATIN SMALL LETTER O # +0585 ; 006F ; MA # ( օ → o ) ARMENIAN SMALL LETTER OH → LATIN SMALL LETTER O # +05E1 ; 006F ; MA # ( ‎ס‎ → o ) HEBREW LETTER SAMEKH → LATIN SMALL LETTER O # +0647 ; 006F ; MA # ( ‎ه‎ → o ) ARABIC LETTER HEH → LATIN SMALL LETTER O # +1EE24 ; 006F ; MA # ( ‎𞸤‎ → o ) ARABIC MATHEMATICAL INITIAL HEH → LATIN SMALL LETTER O # →‎ه‎→ +1EE64 ; 006F ; MA # ( ‎𞹤‎ → o ) ARABIC MATHEMATICAL STRETCHED HEH → LATIN SMALL LETTER O # →‎ه‎→ +1EE84 ; 006F ; MA # ( ‎𞺄‎ → o ) ARABIC MATHEMATICAL LOOPED HEH → LATIN SMALL LETTER O # →‎ه‎→ +FEEB ; 006F ; MA # ( ‎ﻫ‎ → o ) ARABIC LETTER HEH INITIAL FORM → LATIN SMALL LETTER O # →‎ه‎→ +FEEC ; 006F ; MA # ( ‎ﻬ‎ → o ) ARABIC LETTER HEH MEDIAL FORM → LATIN SMALL LETTER O # →‎ه‎→ +FEEA ; 006F ; MA # ( ‎ﻪ‎ → o ) ARABIC LETTER HEH FINAL FORM → LATIN SMALL LETTER O # →‎ه‎→ +FEE9 ; 006F ; MA # ( ‎ﻩ‎ → o ) ARABIC LETTER HEH ISOLATED FORM → LATIN SMALL LETTER O # →‎ه‎→ +06BE ; 006F ; MA # ( ‎ھ‎ → o ) ARABIC LETTER HEH DOACHASHMEE → LATIN SMALL LETTER O # →‎ه‎→ +FBAC ; 006F ; MA # ( ‎ﮬ‎ → o ) ARABIC LETTER HEH DOACHASHMEE INITIAL FORM → LATIN SMALL LETTER O # →‎ﻫ‎→→‎ه‎→ +FBAD ; 006F ; MA # ( ‎ﮭ‎ → o ) ARABIC LETTER HEH DOACHASHMEE MEDIAL FORM → LATIN SMALL LETTER O # 
→‎ﻬ‎→→‎ه‎→ +FBAB ; 006F ; MA # ( ‎ﮫ‎ → o ) ARABIC LETTER HEH DOACHASHMEE FINAL FORM → LATIN SMALL LETTER O # →‎ﻪ‎→→‎ه‎→ +FBAA ; 006F ; MA # ( ‎ﮪ‎ → o ) ARABIC LETTER HEH DOACHASHMEE ISOLATED FORM → LATIN SMALL LETTER O # →‎ه‎→ +06C1 ; 006F ; MA # ( ‎ہ‎ → o ) ARABIC LETTER HEH GOAL → LATIN SMALL LETTER O # →‎ه‎→ +FBA8 ; 006F ; MA # ( ‎ﮨ‎ → o ) ARABIC LETTER HEH GOAL INITIAL FORM → LATIN SMALL LETTER O # →‎ہ‎→→‎ه‎→ +FBA9 ; 006F ; MA # ( ‎ﮩ‎ → o ) ARABIC LETTER HEH GOAL MEDIAL FORM → LATIN SMALL LETTER O # →‎ہ‎→→‎ه‎→ +FBA7 ; 006F ; MA # ( ‎ﮧ‎ → o ) ARABIC LETTER HEH GOAL FINAL FORM → LATIN SMALL LETTER O # →‎ہ‎→→‎ه‎→ +FBA6 ; 006F ; MA # ( ‎ﮦ‎ → o ) ARABIC LETTER HEH GOAL ISOLATED FORM → LATIN SMALL LETTER O # →‎ه‎→ +06D5 ; 006F ; MA # ( ‎ە‎ → o ) ARABIC LETTER AE → LATIN SMALL LETTER O # →‎ه‎→ +0D20 ; 006F ; MA # ( ഠ → o ) MALAYALAM LETTER TTHA → LATIN SMALL LETTER O # +101D ; 006F ; MA # ( ဝ → o ) MYANMAR LETTER WA → LATIN SMALL LETTER O # +104EA ; 006F ; MA # ( 𐓪 → o ) OSAGE SMALL LETTER O → LATIN SMALL LETTER O # +118C8 ; 006F ; MA # ( 𑣈 → o ) WARANG CITI SMALL LETTER E → LATIN SMALL LETTER O # +118D7 ; 006F ; MA # ( 𑣗 → o ) WARANG CITI SMALL LETTER BU → LATIN SMALL LETTER O # +1042C ; 006F ; MA # ( 𐐬 → o ) DESERET SMALL LETTER LONG O → LATIN SMALL LETTER O # + +0030 ; 004F ; MA # ( 0 → O ) DIGIT ZERO → LATIN CAPITAL LETTER O # +07C0 ; 004F ; MA # ( ‎߀‎ → O ) NKO DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +0CE6 ; 004F ; MA # ( ೦ → O ) KANNADA DIGIT ZERO → LATIN CAPITAL LETTER O # +3007 ; 004F ; MA # ( 〇 → O ) IDEOGRAPHIC NUMBER ZERO → LATIN CAPITAL LETTER O # +118E0 ; 004F ; MA # ( 𑣠 → O ) WARANG CITI DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +1CCF0 ; 004F ; MA # ( 𜳰 → O ) OUTLINED DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +1D7CE ; 004F ; MA # ( 𝟎 → O ) MATHEMATICAL BOLD DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +1D7D8 ; 004F ; MA # ( 𝟘 → O ) MATHEMATICAL DOUBLE-STRUCK DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +1D7E2 ; 004F ; MA # ( 𝟢 → O ) MATHEMATICAL SANS-SERIF 
DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +1D7EC ; 004F ; MA # ( 𝟬 → O ) MATHEMATICAL SANS-SERIF BOLD DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +1D7F6 ; 004F ; MA # ( 𝟶 → O ) MATHEMATICAL MONOSPACE DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +1FBF0 ; 004F ; MA # ( 🯰 → O ) SEGMENTED DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ +FF2F ; 004F ; MA # ( O → O ) FULLWIDTH LATIN CAPITAL LETTER O → LATIN CAPITAL LETTER O # →О→ +1CCE4 ; 004F ; MA #* ( 𜳤 → O ) OUTLINED LATIN CAPITAL LETTER O → LATIN CAPITAL LETTER O # +1D40E ; 004F ; MA # ( 𝐎 → O ) MATHEMATICAL BOLD CAPITAL O → LATIN CAPITAL LETTER O # +1D442 ; 004F ; MA # ( 𝑂 → O ) MATHEMATICAL ITALIC CAPITAL O → LATIN CAPITAL LETTER O # +1D476 ; 004F ; MA # ( 𝑶 → O ) MATHEMATICAL BOLD ITALIC CAPITAL O → LATIN CAPITAL LETTER O # +1D4AA ; 004F ; MA # ( 𝒪 → O ) MATHEMATICAL SCRIPT CAPITAL O → LATIN CAPITAL LETTER O # +1D4DE ; 004F ; MA # ( 𝓞 → O ) MATHEMATICAL BOLD SCRIPT CAPITAL O → LATIN CAPITAL LETTER O # +1D512 ; 004F ; MA # ( 𝔒 → O ) MATHEMATICAL FRAKTUR CAPITAL O → LATIN CAPITAL LETTER O # +1D546 ; 004F ; MA # ( 𝕆 → O ) MATHEMATICAL DOUBLE-STRUCK CAPITAL O → LATIN CAPITAL LETTER O # +1D57A ; 004F ; MA # ( 𝕺 → O ) MATHEMATICAL BOLD FRAKTUR CAPITAL O → LATIN CAPITAL LETTER O # +1D5AE ; 004F ; MA # ( 𝖮 → O ) MATHEMATICAL SANS-SERIF CAPITAL O → LATIN CAPITAL LETTER O # +1D5E2 ; 004F ; MA # ( 𝗢 → O ) MATHEMATICAL SANS-SERIF BOLD CAPITAL O → LATIN CAPITAL LETTER O # +1D616 ; 004F ; MA # ( 𝘖 → O ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL O → LATIN CAPITAL LETTER O # +1D64A ; 004F ; MA # ( 𝙊 → O ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL O → LATIN CAPITAL LETTER O # +1D67E ; 004F ; MA # ( 𝙾 → O ) MATHEMATICAL MONOSPACE CAPITAL O → LATIN CAPITAL LETTER O # +039F ; 004F ; MA # ( Ο → O ) GREEK CAPITAL LETTER OMICRON → LATIN CAPITAL LETTER O # +1D6B6 ; 004F ; MA # ( 𝚶 → O ) MATHEMATICAL BOLD CAPITAL OMICRON → LATIN CAPITAL LETTER O # →𝐎→ +1D6F0 ; 004F ; MA # ( 𝛰 → O ) MATHEMATICAL ITALIC CAPITAL OMICRON → LATIN CAPITAL LETTER O # →𝑂→ 
+1D72A ; 004F ; MA # ( 𝜪 → O ) MATHEMATICAL BOLD ITALIC CAPITAL OMICRON → LATIN CAPITAL LETTER O # →𝑶→ +1D764 ; 004F ; MA # ( 𝝤 → O ) MATHEMATICAL SANS-SERIF BOLD CAPITAL OMICRON → LATIN CAPITAL LETTER O # →Ο→ +1D79E ; 004F ; MA # ( 𝞞 → O ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL OMICRON → LATIN CAPITAL LETTER O # →Ο→ +2C9E ; 004F ; MA # ( Ⲟ → O ) COPTIC CAPITAL LETTER O → LATIN CAPITAL LETTER O # +041E ; 004F ; MA # ( О → O ) CYRILLIC CAPITAL LETTER O → LATIN CAPITAL LETTER O # +0555 ; 004F ; MA # ( Օ → O ) ARMENIAN CAPITAL LETTER OH → LATIN CAPITAL LETTER O # +2D54 ; 004F ; MA # ( ⵔ → O ) TIFINAGH LETTER YAR → LATIN CAPITAL LETTER O # +12D0 ; 004F ; MA # ( ዐ → O ) ETHIOPIC SYLLABLE PHARYNGEAL A → LATIN CAPITAL LETTER O # →Օ→ +0B20 ; 004F ; MA # ( ଠ → O ) ORIYA LETTER TTHA → LATIN CAPITAL LETTER O # +104C2 ; 004F ; MA # ( 𐓂 → O ) OSAGE CAPITAL LETTER O → LATIN CAPITAL LETTER O # +A4F3 ; 004F ; MA # ( ꓳ → O ) LISU LETTER O → LATIN CAPITAL LETTER O # +118B5 ; 004F ; MA # ( 𑢵 → O ) WARANG CITI CAPITAL LETTER AT → LATIN CAPITAL LETTER O # +10292 ; 004F ; MA # ( 𐊒 → O ) LYCIAN LETTER U → LATIN CAPITAL LETTER O # +102AB ; 004F ; MA # ( 𐊫 → O ) CARIAN LETTER O → LATIN CAPITAL LETTER O # +10404 ; 004F ; MA # ( 𐐄 → O ) DESERET CAPITAL LETTER LONG O → LATIN CAPITAL LETTER O # +10516 ; 004F ; MA # ( 𐔖 → O ) ELBASAN LETTER O → LATIN CAPITAL LETTER O # +11DE0 ; 004F ; MA # ( 𑷠 → O ) TOLONG SIKI DIGIT ZERO → LATIN CAPITAL LETTER O # →0→ + +2070 ; 00BA ; MA #* ( ⁰ → º ) SUPERSCRIPT ZERO → MASCULINE ORDINAL INDICATOR # +1D52 ; 00BA ; MA # ( ᵒ → º ) MODIFIER LETTER SMALL O → MASCULINE ORDINAL INDICATOR # →⁰→ + +01D2 ; 014F ; MA # ( ǒ → ŏ ) LATIN SMALL LETTER O WITH CARON → LATIN SMALL LETTER O WITH BREVE # + +01D1 ; 014E ; MA # ( Ǒ → Ŏ ) LATIN CAPITAL LETTER O WITH CARON → LATIN CAPITAL LETTER O WITH BREVE # + +06FF ; 006F 0302 ; MA # ( ‎ۿ‎ → ô ) ARABIC LETTER HEH WITH INVERTED V → LATIN SMALL LETTER O, COMBINING CIRCUMFLEX ACCENT # →‎ھٛ‎→ + +0150 ; 00D6 ; MA # ( Ő → Ö ) LATIN 
CAPITAL LETTER O WITH DOUBLE ACUTE → LATIN CAPITAL LETTER O WITH DIAERESIS # + +00F8 ; 006F 0338 ; MA # ( ø → o̸ ) LATIN SMALL LETTER O WITH STROKE → LATIN SMALL LETTER O, COMBINING LONG SOLIDUS OVERLAY # →o̷→ +AB3E ; 006F 0338 ; MA # ( ꬾ → o̸ ) LATIN SMALL LETTER BLACKLETTER O WITH STROKE → LATIN SMALL LETTER O, COMBINING LONG SOLIDUS OVERLAY # →ø→→o̷→ + +00D8 ; 004F 0338 ; MA # ( Ø → O̸ ) LATIN CAPITAL LETTER O WITH STROKE → LATIN CAPITAL LETTER O, COMBINING LONG SOLIDUS OVERLAY # +2D41 ; 004F 0338 ; MA # ( ⵁ → O̸ ) TIFINAGH LETTER BERBER ACADEMY YAH → LATIN CAPITAL LETTER O, COMBINING LONG SOLIDUS OVERLAY # →Ø→ + +01FE ; 004F 0338 0301 ; MA # ( Ǿ → Ó̸ ) LATIN CAPITAL LETTER O WITH STROKE AND ACUTE → LATIN CAPITAL LETTER O, COMBINING LONG SOLIDUS OVERLAY, COMBINING ACUTE ACCENT # + +0275 ; 006F 0335 ; MA # ( ɵ → o̵ ) LATIN SMALL LETTER BARRED O → LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # +A74B ; 006F 0335 ; MA # ( ꝋ → o̵ ) LATIN SMALL LETTER O WITH LONG STROKE OVERLAY → LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # →o̶→ +2C91 ; 006F 0335 ; MA # ( ⲑ → o̵ ) COPTIC SMALL LETTER THETHE → LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # →ɵ→ +04E9 ; 006F 0335 ; MA # ( ө → o̵ ) CYRILLIC SMALL LETTER BARRED O → LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # →ѳ→ +0473 ; 006F 0335 ; MA # ( ѳ → o̵ ) CYRILLIC SMALL LETTER FITA → LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # +AB8E ; 006F 0335 ; MA # ( ꮎ → o̵ ) CHEROKEE SMALL LETTER NA → LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # →ɵ→ +ABBB ; 006F 0335 ; MA # ( ꮻ → o̵ ) CHEROKEE SMALL LETTER WI → LATIN SMALL LETTER O, COMBINING SHORT STROKE OVERLAY # →ѳ→ + +2296 ; 004F 0335 ; MA #* ( ⊖ → O̵ ) CIRCLED MINUS → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +229D ; 004F 0335 ; MA #* ( ⊝ → O̵ ) CIRCLED DASH → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →⊖→→θ→→Ꮎ→ +236C ; 004F 0335 ; MA #* ( ⍬ → O̵ ) APL FUNCTIONAL SYMBOL ZILDE → LATIN CAPITAL 
LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D21A ; 004F 0335 ; MA #* ( 𝈚 → O̵ ) GREEK VOCAL NOTATION SYMBOL-52 → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ꝋ→→O̶→ +1F714 ; 004F 0335 ; MA #* ( 🜔 → O̵ ) ALCHEMICAL SYMBOL FOR SALT → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ɵ→→O̶→ +019F ; 004F 0335 ; MA # ( Ɵ → O̵ ) LATIN CAPITAL LETTER O WITH MIDDLE TILDE → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →O̶→ +A74A ; 004F 0335 ; MA # ( Ꝋ → O̵ ) LATIN CAPITAL LETTER O WITH LONG STROKE OVERLAY → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →O̶→ +03B8 ; 004F 0335 ; MA # ( θ → O̵ ) GREEK SMALL LETTER THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ꮎ→ +03D1 ; 004F 0335 ; MA # ( ϑ → O̵ ) GREEK THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D6C9 ; 004F 0335 ; MA # ( 𝛉 → O̵ ) MATHEMATICAL BOLD SMALL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D6DD ; 004F 0335 ; MA # ( 𝛝 → O̵ ) MATHEMATICAL BOLD THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D703 ; 004F 0335 ; MA # ( 𝜃 → O̵ ) MATHEMATICAL ITALIC SMALL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D717 ; 004F 0335 ; MA # ( 𝜗 → O̵ ) MATHEMATICAL ITALIC THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D73D ; 004F 0335 ; MA # ( 𝜽 → O̵ ) MATHEMATICAL BOLD ITALIC SMALL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D751 ; 004F 0335 ; MA # ( 𝝑 → O̵ ) MATHEMATICAL BOLD ITALIC THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D777 ; 004F 0335 ; MA # ( 𝝷 → O̵ ) MATHEMATICAL SANS-SERIF BOLD SMALL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D78B ; 004F 0335 ; MA # ( 𝞋 → O̵ ) MATHEMATICAL SANS-SERIF BOLD THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D7B1 ; 004F 0335 ; MA # ( 𝞱 → O̵ ) 
MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +1D7C5 ; 004F 0335 ; MA # ( 𝟅 → O̵ ) MATHEMATICAL SANS-SERIF BOLD ITALIC THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →θ→→Ꮎ→ +0398 ; 004F 0335 ; MA # ( Θ → O̵ ) GREEK CAPITAL LETTER THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ꮎ→ +03F4 ; 004F 0335 ; MA # ( ϴ → O̵ ) GREEK CAPITAL THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ѳ→→О̵→ +1D6AF ; 004F 0335 ; MA # ( 𝚯 → O̵ ) MATHEMATICAL BOLD CAPITAL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D6B9 ; 004F 0335 ; MA # ( 𝚹 → O̵ ) MATHEMATICAL BOLD CAPITAL THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D6E9 ; 004F 0335 ; MA # ( 𝛩 → O̵ ) MATHEMATICAL ITALIC CAPITAL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D6F3 ; 004F 0335 ; MA # ( 𝛳 → O̵ ) MATHEMATICAL ITALIC CAPITAL THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D723 ; 004F 0335 ; MA # ( 𝜣 → O̵ ) MATHEMATICAL BOLD ITALIC CAPITAL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D72D ; 004F 0335 ; MA # ( 𝜭 → O̵ ) MATHEMATICAL BOLD ITALIC CAPITAL THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D75D ; 004F 0335 ; MA # ( 𝝝 → O̵ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D767 ; 004F 0335 ; MA # ( 𝝧 → O̵ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D797 ; 004F 0335 ; MA # ( 𝞗 → O̵ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL THETA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +1D7A1 ; 004F 0335 ; MA # ( 𝞡 → O̵ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL THETA SYMBOL → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Θ→→Ꮎ→ +2C90 ; 004F 
0335 ; MA # ( Ⲑ → O̵ ) COPTIC CAPITAL LETTER THETHE → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →ϴ→→Ѳ→→О̵→ +04E8 ; 004F 0335 ; MA # ( Ө → O̵ ) CYRILLIC CAPITAL LETTER BARRED O → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ѳ→→О̵→ +0472 ; 004F 0335 ; MA # ( Ѳ → O̵ ) CYRILLIC CAPITAL LETTER FITA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →О̵→ +2D31 ; 004F 0335 ; MA # ( ⴱ → O̵ ) TIFINAGH LETTER YAB → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ɵ→→O̶→ +13BE ; 004F 0335 ; MA # ( Ꮎ → O̵ ) CHEROKEE LETTER NA → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # +13EB ; 004F 0335 ; MA # ( Ꮻ → O̵ ) CHEROKEE LETTER WI → LATIN CAPITAL LETTER O, COMBINING SHORT STROKE OVERLAY # →Ѳ→→О̵→ + +AB74 ; 006F 031B ; MA # ( ꭴ → ơ ) CHEROKEE SMALL LETTER U → LATIN SMALL LETTER O, COMBINING HORN # + +FCD9 ; 006F 0670 ; MA # ( ‎ﳙ‎ → oٰ ) ARABIC LIGATURE HEH WITH SUPERSCRIPT ALEF INITIAL FORM → LATIN SMALL LETTER O, ARABIC LETTER SUPERSCRIPT ALEF # →‎هٰ‎→ + +1F101 ; 004F 002C ; MA #* ( 🄁 → O, ) DIGIT ZERO COMMA → LATIN CAPITAL LETTER O, COMMA # →0,→ + +1F100 ; 004F 002E ; MA #* ( 🄀 → O. 
) DIGIT ZERO FULL STOP → LATIN CAPITAL LETTER O, FULL STOP # →0.→ + +01A1 ; 006F 0027 ; MA # ( ơ → o' ) LATIN SMALL LETTER O WITH HORN → LATIN SMALL LETTER O, APOSTROPHE # →oʼ→ + +01A0 ; 004F 0027 ; MA # ( Ơ → O' ) LATIN CAPITAL LETTER O WITH HORN → LATIN CAPITAL LETTER O, APOSTROPHE # →Oʼ→ +13A4 ; 004F 0027 ; MA # ( Ꭴ → O' ) CHEROKEE LETTER U → LATIN CAPITAL LETTER O, APOSTROPHE # →Ơ→→Oʼ→ + +0025 ; 00BA 002F 2080 ; MA #* ( % → º/₀ ) PERCENT SIGN → MASCULINE ORDINAL INDICATOR, SOLIDUS, SUBSCRIPT ZERO # →⁰/₀→ +066A ; 00BA 002F 2080 ; MA #* ( ٪ → º/₀ ) ARABIC PERCENT SIGN → MASCULINE ORDINAL INDICATOR, SOLIDUS, SUBSCRIPT ZERO # →%→→⁰/₀→ +2052 ; 00BA 002F 2080 ; MA #* ( ⁒ → º/₀ ) COMMERCIAL MINUS SIGN → MASCULINE ORDINAL INDICATOR, SOLIDUS, SUBSCRIPT ZERO # →%→→⁰/₀→ + +2030 ; 00BA 002F 2080 2080 ; MA #* ( ‰ → º/₀₀ ) PER MILLE SIGN → MASCULINE ORDINAL INDICATOR, SOLIDUS, SUBSCRIPT ZERO, SUBSCRIPT ZERO # →⁰/₀₀→ +0609 ; 00BA 002F 2080 2080 ; MA #* ( ؉ → º/₀₀ ) ARABIC-INDIC PER MILLE SIGN → MASCULINE ORDINAL INDICATOR, SOLIDUS, SUBSCRIPT ZERO, SUBSCRIPT ZERO # →‰→→⁰/₀₀→ + +2031 ; 00BA 002F 2080 2080 2080 ; MA #* ( ‱ → º/₀₀₀ ) PER TEN THOUSAND SIGN → MASCULINE ORDINAL INDICATOR, SOLIDUS, SUBSCRIPT ZERO, SUBSCRIPT ZERO, SUBSCRIPT ZERO # →⁰/₀₀₀→ +060A ; 00BA 002F 2080 2080 2080 ; MA #* ( ؊ → º/₀₀₀ ) ARABIC-INDIC PER TEN THOUSAND SIGN → MASCULINE ORDINAL INDICATOR, SOLIDUS, SUBSCRIPT ZERO, SUBSCRIPT ZERO, SUBSCRIPT ZERO # →‱→→⁰/₀₀₀→ + +0153 ; 006F 0065 ; MA # ( œ → oe ) LATIN SMALL LIGATURE OE → LATIN SMALL LETTER O, LATIN SMALL LETTER E # + +0152 ; 004F 0045 ; MA # ( Œ → OE ) LATIN CAPITAL LIGATURE OE → LATIN CAPITAL LETTER O, LATIN CAPITAL LETTER E # + +0276 ; 006F 1D07 ; MA # ( ɶ → oᴇ ) LATIN LETTER SMALL CAPITAL OE → LATIN SMALL LETTER O, LATIN LETTER SMALL CAPITAL E # + +221E ; 006F 006F ; MA #* ( ∞ → oo ) INFINITY → LATIN SMALL LETTER O, LATIN SMALL LETTER O # →ꝏ→ +A74F ; 006F 006F ; MA # ( ꝏ → oo ) LATIN SMALL LETTER OO → LATIN SMALL LETTER O, LATIN SMALL LETTER O # 
+A699 ; 006F 006F ; MA # ( ꚙ → oo ) CYRILLIC SMALL LETTER DOUBLE O → LATIN SMALL LETTER O, LATIN SMALL LETTER O # + +A74E ; 004F 004F ; MA # ( Ꝏ → OO ) LATIN CAPITAL LETTER OO → LATIN CAPITAL LETTER O, LATIN CAPITAL LETTER O # +A698 ; 004F 004F ; MA # ( Ꚙ → OO ) CYRILLIC CAPITAL LETTER DOUBLE O → LATIN CAPITAL LETTER O, LATIN CAPITAL LETTER O # + +FCD7 ; 006F 062C ; MA # ( ‎ﳗ‎ → ‎oج‎ ) ARABIC LIGATURE HEH WITH JEEM INITIAL FORM → LATIN SMALL LETTER O, ARABIC LETTER JEEM # →‎هج‎→ +FC51 ; 006F 062C ; MA # ( ‎ﱑ‎ → ‎oج‎ ) ARABIC LIGATURE HEH WITH JEEM ISOLATED FORM → LATIN SMALL LETTER O, ARABIC LETTER JEEM # →‎هج‎→ + +FCD8 ; 006F 0645 ; MA # ( ‎ﳘ‎ → ‎oم‎ ) ARABIC LIGATURE HEH WITH MEEM INITIAL FORM → LATIN SMALL LETTER O, ARABIC LETTER MEEM # →‎هم‎→ +FC52 ; 006F 0645 ; MA # ( ‎ﱒ‎ → ‎oم‎ ) ARABIC LIGATURE HEH WITH MEEM ISOLATED FORM → LATIN SMALL LETTER O, ARABIC LETTER MEEM # →‎هم‎→ + +FD93 ; 006F 0645 062C ; MA # ( ‎ﶓ‎ → ‎oمج‎ ) ARABIC LIGATURE HEH WITH MEEM WITH JEEM INITIAL FORM → LATIN SMALL LETTER O, ARABIC LETTER MEEM, ARABIC LETTER JEEM # →‎همج‎→ + +FD94 ; 006F 0645 0645 ; MA # ( ‎ﶔ‎ → ‎oمم‎ ) ARABIC LIGATURE HEH WITH MEEM WITH MEEM INITIAL FORM → LATIN SMALL LETTER O, ARABIC LETTER MEEM, ARABIC LETTER MEEM # →‎همم‎→ + +FC53 ; 006F 0649 ; MA # ( ‎ﱓ‎ → ‎oى‎ ) ARABIC LIGATURE HEH WITH ALEF MAKSURA ISOLATED FORM → LATIN SMALL LETTER O, ARABIC LETTER ALEF MAKSURA # →‎هى‎→ +FC54 ; 006F 0649 ; MA # ( ‎ﱔ‎ → ‎oى‎ ) ARABIC LIGATURE HEH WITH YEH ISOLATED FORM → LATIN SMALL LETTER O, ARABIC LETTER ALEF MAKSURA # →‎هي‎→ + +0D5F ; 006F 0D30 006F ; MA # ( ൟ → oരo ) MALAYALAM LETTER ARCHAIC II → LATIN SMALL LETTER O, MALAYALAM LETTER RA, LATIN SMALL LETTER O # →ംരം→ + +10D7 ; 006F 102C ; MA # ( თ → oာ ) GEORGIAN LETTER TAN → LATIN SMALL LETTER O, MYANMAR VOWEL SIGN AA # →တ→→ဝာ→ +1010 ; 006F 102C ; MA # ( တ → oာ ) MYANMAR LETTER TA → LATIN SMALL LETTER O, MYANMAR VOWEL SIGN AA # →ဝာ→ + +3358 ; 004F 70B9 ; MA #* ( ㍘ → O点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR ZERO → LATIN 
CAPITAL LETTER O, CJK UNIFIED IDEOGRAPH-70B9 # →0点→ + +2184 ; 0254 ; MA # ( ↄ → ɔ ) LATIN SMALL LETTER REVERSED C → LATIN SMALL LETTER OPEN O # +1D10 ; 0254 ; MA # ( ᴐ → ɔ ) LATIN LETTER SMALL CAPITAL OPEN O → LATIN SMALL LETTER OPEN O # +037B ; 0254 ; MA # ( ͻ → ɔ ) GREEK SMALL REVERSED LUNATE SIGMA SYMBOL → LATIN SMALL LETTER OPEN O # +1044B ; 0254 ; MA # ( 𐑋 → ɔ ) DESERET SMALL LETTER EM → LATIN SMALL LETTER OPEN O # + +2183 ; 0186 ; MA # ( Ↄ → Ɔ ) ROMAN NUMERAL REVERSED ONE HUNDRED → LATIN CAPITAL LETTER OPEN O # +03FD ; 0186 ; MA # ( Ͻ → Ɔ ) GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL → LATIN CAPITAL LETTER OPEN O # +A4DB ; 0186 ; MA # ( ꓛ → Ɔ ) LISU LETTER CHA → LATIN CAPITAL LETTER OPEN O # +10423 ; 0186 ; MA # ( 𐐣 → Ɔ ) DESERET CAPITAL LETTER EM → LATIN CAPITAL LETTER OPEN O # + +AB3F ; 0254 0338 ; MA # ( ꬿ → ɔ̸ ) LATIN SMALL LETTER OPEN O WITH STROKE → LATIN SMALL LETTER OPEN O, COMBINING LONG SOLIDUS OVERLAY # + +AB62 ; 0254 0065 ; MA # ( ꭢ → ɔe ) LATIN SMALL LETTER OPEN OE → LATIN SMALL LETTER OPEN O, LATIN SMALL LETTER E # + +1043F ; 0277 ; MA # ( 𐐿 → ɷ ) DESERET SMALL LETTER KAY → LATIN SMALL LETTER CLOSED OMEGA # + +2374 ; 0070 ; MA #* ( ⍴ → p ) APL FUNCTIONAL SYMBOL RHO → LATIN SMALL LETTER P # →ρ→ +FF50 ; 0070 ; MA # ( p → p ) FULLWIDTH LATIN SMALL LETTER P → LATIN SMALL LETTER P # →р→ +1D429 ; 0070 ; MA # ( 𝐩 → p ) MATHEMATICAL BOLD SMALL P → LATIN SMALL LETTER P # +1D45D ; 0070 ; MA # ( 𝑝 → p ) MATHEMATICAL ITALIC SMALL P → LATIN SMALL LETTER P # +1D491 ; 0070 ; MA # ( 𝒑 → p ) MATHEMATICAL BOLD ITALIC SMALL P → LATIN SMALL LETTER P # +1D4C5 ; 0070 ; MA # ( 𝓅 → p ) MATHEMATICAL SCRIPT SMALL P → LATIN SMALL LETTER P # +1D4F9 ; 0070 ; MA # ( 𝓹 → p ) MATHEMATICAL BOLD SCRIPT SMALL P → LATIN SMALL LETTER P # +1D52D ; 0070 ; MA # ( 𝔭 → p ) MATHEMATICAL FRAKTUR SMALL P → LATIN SMALL LETTER P # +1D561 ; 0070 ; MA # ( 𝕡 → p ) MATHEMATICAL DOUBLE-STRUCK SMALL P → LATIN SMALL LETTER P # +1D595 ; 0070 ; MA # ( 𝖕 → p ) MATHEMATICAL BOLD FRAKTUR SMALL P → LATIN 
SMALL LETTER P # +1D5C9 ; 0070 ; MA # ( 𝗉 → p ) MATHEMATICAL SANS-SERIF SMALL P → LATIN SMALL LETTER P # +1D5FD ; 0070 ; MA # ( 𝗽 → p ) MATHEMATICAL SANS-SERIF BOLD SMALL P → LATIN SMALL LETTER P # +1D631 ; 0070 ; MA # ( 𝘱 → p ) MATHEMATICAL SANS-SERIF ITALIC SMALL P → LATIN SMALL LETTER P # +1D665 ; 0070 ; MA # ( 𝙥 → p ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL P → LATIN SMALL LETTER P # +1D699 ; 0070 ; MA # ( 𝚙 → p ) MATHEMATICAL MONOSPACE SMALL P → LATIN SMALL LETTER P # +00FE ; 0070 ; MA # ( þ → p ) LATIN SMALL LETTER THORN → LATIN SMALL LETTER P # →ƿ→ +01BF ; 0070 ; MA # ( ƿ → p ) LATIN LETTER WYNN → LATIN SMALL LETTER P # +03C1 ; 0070 ; MA # ( ρ → p ) GREEK SMALL LETTER RHO → LATIN SMALL LETTER P # +03F1 ; 0070 ; MA # ( ϱ → p ) GREEK RHO SYMBOL → LATIN SMALL LETTER P # →ρ→ +1D6D2 ; 0070 ; MA # ( 𝛒 → p ) MATHEMATICAL BOLD SMALL RHO → LATIN SMALL LETTER P # →ρ→ +1D6E0 ; 0070 ; MA # ( 𝛠 → p ) MATHEMATICAL BOLD RHO SYMBOL → LATIN SMALL LETTER P # →ρ→ +1D70C ; 0070 ; MA # ( 𝜌 → p ) MATHEMATICAL ITALIC SMALL RHO → LATIN SMALL LETTER P # →ρ→ +1D71A ; 0070 ; MA # ( 𝜚 → p ) MATHEMATICAL ITALIC RHO SYMBOL → LATIN SMALL LETTER P # →ρ→ +1D746 ; 0070 ; MA # ( 𝝆 → p ) MATHEMATICAL BOLD ITALIC SMALL RHO → LATIN SMALL LETTER P # →ρ→ +1D754 ; 0070 ; MA # ( 𝝔 → p ) MATHEMATICAL BOLD ITALIC RHO SYMBOL → LATIN SMALL LETTER P # →ρ→ +1D780 ; 0070 ; MA # ( 𝞀 → p ) MATHEMATICAL SANS-SERIF BOLD SMALL RHO → LATIN SMALL LETTER P # →ρ→ +1D78E ; 0070 ; MA # ( 𝞎 → p ) MATHEMATICAL SANS-SERIF BOLD RHO SYMBOL → LATIN SMALL LETTER P # →ρ→ +1D7BA ; 0070 ; MA # ( 𝞺 → p ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL RHO → LATIN SMALL LETTER P # →ρ→ +1D7C8 ; 0070 ; MA # ( 𝟈 → p ) MATHEMATICAL SANS-SERIF BOLD ITALIC RHO SYMBOL → LATIN SMALL LETTER P # →ρ→ +03F8 ; 0070 ; MA # ( ϸ → p ) GREEK SMALL LETTER SHO → LATIN SMALL LETTER P # →þ→→ƿ→ +2CA3 ; 0070 ; MA # ( ⲣ → p ) COPTIC SMALL LETTER RO → LATIN SMALL LETTER P # →ρ→ +2CCF ; 0070 ; MA # ( ⳏ → p ) COPTIC SMALL LETTER OLD COPTIC HA → LATIN SMALL 
LETTER P # +0440 ; 0070 ; MA # ( р → p ) CYRILLIC SMALL LETTER ER → LATIN SMALL LETTER P # + +FF30 ; 0050 ; MA # ( P → P ) FULLWIDTH LATIN CAPITAL LETTER P → LATIN CAPITAL LETTER P # →Р→ +2119 ; 0050 ; MA # ( ℙ → P ) DOUBLE-STRUCK CAPITAL P → LATIN CAPITAL LETTER P # +1CCE5 ; 0050 ; MA #* ( 𜳥 → P ) OUTLINED LATIN CAPITAL LETTER P → LATIN CAPITAL LETTER P # +1D40F ; 0050 ; MA # ( 𝐏 → P ) MATHEMATICAL BOLD CAPITAL P → LATIN CAPITAL LETTER P # +1D443 ; 0050 ; MA # ( 𝑃 → P ) MATHEMATICAL ITALIC CAPITAL P → LATIN CAPITAL LETTER P # +1D477 ; 0050 ; MA # ( 𝑷 → P ) MATHEMATICAL BOLD ITALIC CAPITAL P → LATIN CAPITAL LETTER P # +1D4AB ; 0050 ; MA # ( 𝒫 → P ) MATHEMATICAL SCRIPT CAPITAL P → LATIN CAPITAL LETTER P # +1D4DF ; 0050 ; MA # ( 𝓟 → P ) MATHEMATICAL BOLD SCRIPT CAPITAL P → LATIN CAPITAL LETTER P # +1D513 ; 0050 ; MA # ( 𝔓 → P ) MATHEMATICAL FRAKTUR CAPITAL P → LATIN CAPITAL LETTER P # +1D57B ; 0050 ; MA # ( 𝕻 → P ) MATHEMATICAL BOLD FRAKTUR CAPITAL P → LATIN CAPITAL LETTER P # +1D5AF ; 0050 ; MA # ( 𝖯 → P ) MATHEMATICAL SANS-SERIF CAPITAL P → LATIN CAPITAL LETTER P # +1D5E3 ; 0050 ; MA # ( 𝗣 → P ) MATHEMATICAL SANS-SERIF BOLD CAPITAL P → LATIN CAPITAL LETTER P # +1D617 ; 0050 ; MA # ( 𝘗 → P ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL P → LATIN CAPITAL LETTER P # +1D64B ; 0050 ; MA # ( 𝙋 → P ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL P → LATIN CAPITAL LETTER P # +1D67F ; 0050 ; MA # ( 𝙿 → P ) MATHEMATICAL MONOSPACE CAPITAL P → LATIN CAPITAL LETTER P # +03A1 ; 0050 ; MA # ( Ρ → P ) GREEK CAPITAL LETTER RHO → LATIN CAPITAL LETTER P # +1D6B8 ; 0050 ; MA # ( 𝚸 → P ) MATHEMATICAL BOLD CAPITAL RHO → LATIN CAPITAL LETTER P # →𝐏→ +1D6F2 ; 0050 ; MA # ( 𝛲 → P ) MATHEMATICAL ITALIC CAPITAL RHO → LATIN CAPITAL LETTER P # →Ρ→ +1D72C ; 0050 ; MA # ( 𝜬 → P ) MATHEMATICAL BOLD ITALIC CAPITAL RHO → LATIN CAPITAL LETTER P # →Ρ→ +1D766 ; 0050 ; MA # ( 𝝦 → P ) MATHEMATICAL SANS-SERIF BOLD CAPITAL RHO → LATIN CAPITAL LETTER P # →Ρ→ +1D7A0 ; 0050 ; MA # ( 𝞠 → P ) MATHEMATICAL SANS-SERIF 
BOLD ITALIC CAPITAL RHO → LATIN CAPITAL LETTER P # →Ρ→ +2CA2 ; 0050 ; MA # ( Ⲣ → P ) COPTIC CAPITAL LETTER RO → LATIN CAPITAL LETTER P # +2CCE ; 0050 ; MA # ( Ⳏ → P ) COPTIC CAPITAL LETTER OLD COPTIC HA → LATIN CAPITAL LETTER P # +0420 ; 0050 ; MA # ( Р → P ) CYRILLIC CAPITAL LETTER ER → LATIN CAPITAL LETTER P # +13E2 ; 0050 ; MA # ( Ꮲ → P ) CHEROKEE LETTER TLV → LATIN CAPITAL LETTER P # +146D ; 0050 ; MA # ( ᑭ → P ) CANADIAN SYLLABICS KI → LATIN CAPITAL LETTER P # +A4D1 ; 0050 ; MA # ( ꓑ → P ) LISU LETTER PA → LATIN CAPITAL LETTER P # +10295 ; 0050 ; MA # ( 𐊕 → P ) LYCIAN LETTER R → LATIN CAPITAL LETTER P # + +01A5 ; 0070 0314 ; MA # ( ƥ → p̔ ) LATIN SMALL LETTER P WITH HOOK → LATIN SMALL LETTER P, COMBINING REVERSED COMMA ABOVE # + +1D7D ; 0070 0335 ; MA # ( ᵽ → p̵ ) LATIN SMALL LETTER P WITH STROKE → LATIN SMALL LETTER P, COMBINING SHORT STROKE OVERLAY # + +1477 ; 0070 00B7 ; MA # ( ᑷ → p· ) CANADIAN SYLLABICS WEST-CREE KWI → LATIN SMALL LETTER P, MIDDLE DOT # →pᐧ→ + +1486 ; 0050 0027 ; MA # ( ᒆ → P' ) CANADIAN SYLLABICS SOUTH-SLAVEY KIH → LATIN CAPITAL LETTER P, APOSTROPHE # →ᑭᑊ→ + +1D29 ; 1D18 ; MA # ( ᴩ → ᴘ ) GREEK LETTER SMALL CAPITAL RHO → LATIN LETTER SMALL CAPITAL P # +ABB2 ; 1D18 ; MA # ( ꮲ → ᴘ ) CHEROKEE SMALL LETTER TLV → LATIN LETTER SMALL CAPITAL P # + +03C6 ; 0278 ; MA # ( φ → ɸ ) GREEK SMALL LETTER PHI → LATIN SMALL LETTER PHI # +03D5 ; 0278 ; MA # ( ϕ → ɸ ) GREEK PHI SYMBOL → LATIN SMALL LETTER PHI # +1D6D7 ; 0278 ; MA # ( 𝛗 → ɸ ) MATHEMATICAL BOLD SMALL PHI → LATIN SMALL LETTER PHI # →φ→ +1D6DF ; 0278 ; MA # ( 𝛟 → ɸ ) MATHEMATICAL BOLD PHI SYMBOL → LATIN SMALL LETTER PHI # →φ→ +1D711 ; 0278 ; MA # ( 𝜑 → ɸ ) MATHEMATICAL ITALIC SMALL PHI → LATIN SMALL LETTER PHI # →φ→ +1D719 ; 0278 ; MA # ( 𝜙 → ɸ ) MATHEMATICAL ITALIC PHI SYMBOL → LATIN SMALL LETTER PHI # →φ→ +1D74B ; 0278 ; MA # ( 𝝋 → ɸ ) MATHEMATICAL BOLD ITALIC SMALL PHI → LATIN SMALL LETTER PHI # →φ→ +1D753 ; 0278 ; MA # ( 𝝓 → ɸ ) MATHEMATICAL BOLD ITALIC PHI SYMBOL → LATIN SMALL LETTER PHI # 
→φ→ +1D785 ; 0278 ; MA # ( 𝞅 → ɸ ) MATHEMATICAL SANS-SERIF BOLD SMALL PHI → LATIN SMALL LETTER PHI # →φ→ +1D78D ; 0278 ; MA # ( 𝞍 → ɸ ) MATHEMATICAL SANS-SERIF BOLD PHI SYMBOL → LATIN SMALL LETTER PHI # →φ→ +1D7BF ; 0278 ; MA # ( 𝞿 → ɸ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL PHI → LATIN SMALL LETTER PHI # →φ→ +1D7C7 ; 0278 ; MA # ( 𝟇 → ɸ ) MATHEMATICAL SANS-SERIF BOLD ITALIC PHI SYMBOL → LATIN SMALL LETTER PHI # →φ→ +2CAB ; 0278 ; MA # ( ⲫ → ɸ ) COPTIC SMALL LETTER FI → LATIN SMALL LETTER PHI # →ϕ→ +2CE1 ; 0278 ; MA # ( ⳡ → ɸ ) COPTIC SMALL LETTER OLD NUBIAN NYI → LATIN SMALL LETTER PHI # →φ→ +2CE0 ; 0278 ; MA # ( Ⳡ → ɸ ) COPTIC CAPITAL LETTER OLD NUBIAN NYI → LATIN SMALL LETTER PHI # →φ→ +0444 ; 0278 ; MA # ( ф → ɸ ) CYRILLIC SMALL LETTER EF → LATIN SMALL LETTER PHI # + +1D42A ; 0071 ; MA # ( 𝐪 → q ) MATHEMATICAL BOLD SMALL Q → LATIN SMALL LETTER Q # +1D45E ; 0071 ; MA # ( 𝑞 → q ) MATHEMATICAL ITALIC SMALL Q → LATIN SMALL LETTER Q # +1D492 ; 0071 ; MA # ( 𝒒 → q ) MATHEMATICAL BOLD ITALIC SMALL Q → LATIN SMALL LETTER Q # +1D4C6 ; 0071 ; MA # ( 𝓆 → q ) MATHEMATICAL SCRIPT SMALL Q → LATIN SMALL LETTER Q # +1D4FA ; 0071 ; MA # ( 𝓺 → q ) MATHEMATICAL BOLD SCRIPT SMALL Q → LATIN SMALL LETTER Q # +1D52E ; 0071 ; MA # ( 𝔮 → q ) MATHEMATICAL FRAKTUR SMALL Q → LATIN SMALL LETTER Q # +1D562 ; 0071 ; MA # ( 𝕢 → q ) MATHEMATICAL DOUBLE-STRUCK SMALL Q → LATIN SMALL LETTER Q # +1D596 ; 0071 ; MA # ( 𝖖 → q ) MATHEMATICAL BOLD FRAKTUR SMALL Q → LATIN SMALL LETTER Q # +1D5CA ; 0071 ; MA # ( 𝗊 → q ) MATHEMATICAL SANS-SERIF SMALL Q → LATIN SMALL LETTER Q # +1D5FE ; 0071 ; MA # ( 𝗾 → q ) MATHEMATICAL SANS-SERIF BOLD SMALL Q → LATIN SMALL LETTER Q # +1D632 ; 0071 ; MA # ( 𝘲 → q ) MATHEMATICAL SANS-SERIF ITALIC SMALL Q → LATIN SMALL LETTER Q # +1D666 ; 0071 ; MA # ( 𝙦 → q ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL Q → LATIN SMALL LETTER Q # +1D69A ; 0071 ; MA # ( 𝚚 → q ) MATHEMATICAL MONOSPACE SMALL Q → LATIN SMALL LETTER Q # +051B ; 0071 ; MA # ( ԛ → q ) CYRILLIC SMALL LETTER QA → 
LATIN SMALL LETTER Q # +0563 ; 0071 ; MA # ( գ → q ) ARMENIAN SMALL LETTER GIM → LATIN SMALL LETTER Q # +0566 ; 0071 ; MA # ( զ → q ) ARMENIAN SMALL LETTER ZA → LATIN SMALL LETTER Q # + +211A ; 0051 ; MA # ( ℚ → Q ) DOUBLE-STRUCK CAPITAL Q → LATIN CAPITAL LETTER Q # +1CCE6 ; 0051 ; MA #* ( 𜳦 → Q ) OUTLINED LATIN CAPITAL LETTER Q → LATIN CAPITAL LETTER Q # +1D410 ; 0051 ; MA # ( 𝐐 → Q ) MATHEMATICAL BOLD CAPITAL Q → LATIN CAPITAL LETTER Q # +1D444 ; 0051 ; MA # ( 𝑄 → Q ) MATHEMATICAL ITALIC CAPITAL Q → LATIN CAPITAL LETTER Q # +1D478 ; 0051 ; MA # ( 𝑸 → Q ) MATHEMATICAL BOLD ITALIC CAPITAL Q → LATIN CAPITAL LETTER Q # +1D4AC ; 0051 ; MA # ( 𝒬 → Q ) MATHEMATICAL SCRIPT CAPITAL Q → LATIN CAPITAL LETTER Q # +1D4E0 ; 0051 ; MA # ( 𝓠 → Q ) MATHEMATICAL BOLD SCRIPT CAPITAL Q → LATIN CAPITAL LETTER Q # +1D514 ; 0051 ; MA # ( 𝔔 → Q ) MATHEMATICAL FRAKTUR CAPITAL Q → LATIN CAPITAL LETTER Q # +1D57C ; 0051 ; MA # ( 𝕼 → Q ) MATHEMATICAL BOLD FRAKTUR CAPITAL Q → LATIN CAPITAL LETTER Q # +1D5B0 ; 0051 ; MA # ( 𝖰 → Q ) MATHEMATICAL SANS-SERIF CAPITAL Q → LATIN CAPITAL LETTER Q # +1D5E4 ; 0051 ; MA # ( 𝗤 → Q ) MATHEMATICAL SANS-SERIF BOLD CAPITAL Q → LATIN CAPITAL LETTER Q # +1D618 ; 0051 ; MA # ( 𝘘 → Q ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL Q → LATIN CAPITAL LETTER Q # +1D64C ; 0051 ; MA # ( 𝙌 → Q ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL Q → LATIN CAPITAL LETTER Q # +1D680 ; 0051 ; MA # ( 𝚀 → Q ) MATHEMATICAL MONOSPACE CAPITAL Q → LATIN CAPITAL LETTER Q # +2D55 ; 0051 ; MA # ( ⵕ → Q ) TIFINAGH LETTER YARR → LATIN CAPITAL LETTER Q # + +02A0 ; 0071 0314 ; MA # ( ʠ → q̔ ) LATIN SMALL LETTER Q WITH HOOK → LATIN SMALL LETTER Q, COMBINING REVERSED COMMA ABOVE # + +1F700 ; 0051 0045 ; MA #* ( 🜀 → QE ) ALCHEMICAL SYMBOL FOR QUINTESSENCE → LATIN CAPITAL LETTER Q, LATIN CAPITAL LETTER E # + +1D90 ; 024B ; MA # ( ᶐ → ɋ ) LATIN SMALL LETTER ALPHA WITH RETROFLEX HOOK → LATIN SMALL LETTER Q WITH HOOK TAIL # + +1D0B ; 0138 ; MA # ( ᴋ → ĸ ) LATIN LETTER SMALL CAPITAL K → LATIN SMALL 
LETTER KRA # +03BA ; 0138 ; MA # ( κ → ĸ ) GREEK SMALL LETTER KAPPA → LATIN SMALL LETTER KRA # +03F0 ; 0138 ; MA # ( ϰ → ĸ ) GREEK KAPPA SYMBOL → LATIN SMALL LETTER KRA # →κ→ +1D6CB ; 0138 ; MA # ( 𝛋 → ĸ ) MATHEMATICAL BOLD SMALL KAPPA → LATIN SMALL LETTER KRA # →κ→ +1D6DE ; 0138 ; MA # ( 𝛞 → ĸ ) MATHEMATICAL BOLD KAPPA SYMBOL → LATIN SMALL LETTER KRA # →κ→ +1D705 ; 0138 ; MA # ( 𝜅 → ĸ ) MATHEMATICAL ITALIC SMALL KAPPA → LATIN SMALL LETTER KRA # →κ→ +1D718 ; 0138 ; MA # ( 𝜘 → ĸ ) MATHEMATICAL ITALIC KAPPA SYMBOL → LATIN SMALL LETTER KRA # →κ→ +1D73F ; 0138 ; MA # ( 𝜿 → ĸ ) MATHEMATICAL BOLD ITALIC SMALL KAPPA → LATIN SMALL LETTER KRA # →κ→ +1D752 ; 0138 ; MA # ( 𝝒 → ĸ ) MATHEMATICAL BOLD ITALIC KAPPA SYMBOL → LATIN SMALL LETTER KRA # →κ→ +1D779 ; 0138 ; MA # ( 𝝹 → ĸ ) MATHEMATICAL SANS-SERIF BOLD SMALL KAPPA → LATIN SMALL LETTER KRA # →κ→ +1D78C ; 0138 ; MA # ( 𝞌 → ĸ ) MATHEMATICAL SANS-SERIF BOLD KAPPA SYMBOL → LATIN SMALL LETTER KRA # →κ→ +1D7B3 ; 0138 ; MA # ( 𝞳 → ĸ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL KAPPA → LATIN SMALL LETTER KRA # →κ→ +1D7C6 ; 0138 ; MA # ( 𝟆 → ĸ ) MATHEMATICAL SANS-SERIF BOLD ITALIC KAPPA SYMBOL → LATIN SMALL LETTER KRA # →κ→ +2C95 ; 0138 ; MA # ( ⲕ → ĸ ) COPTIC SMALL LETTER KAPA → LATIN SMALL LETTER KRA # →κ→ +043A ; 0138 ; MA # ( к → ĸ ) CYRILLIC SMALL LETTER KA → LATIN SMALL LETTER KRA # +ABB6 ; 0138 ; MA # ( ꮶ → ĸ ) CHEROKEE SMALL LETTER TSO → LATIN SMALL LETTER KRA # →ᴋ→ + +049B ; 0138 0329 ; MA # ( қ → ĸ̩ ) CYRILLIC SMALL LETTER KA WITH DESCENDER → LATIN SMALL LETTER KRA, COMBINING VERTICAL LINE BELOW # →к̩→ + +049F ; 0138 0335 ; MA # ( ҟ → ĸ̵ ) CYRILLIC SMALL LETTER KA WITH STROKE → LATIN SMALL LETTER KRA, COMBINING SHORT STROKE OVERLAY # →к̵→ + +1D42B ; 0072 ; MA # ( 𝐫 → r ) MATHEMATICAL BOLD SMALL R → LATIN SMALL LETTER R # +1D45F ; 0072 ; MA # ( 𝑟 → r ) MATHEMATICAL ITALIC SMALL R → LATIN SMALL LETTER R # +1D493 ; 0072 ; MA # ( 𝒓 → r ) MATHEMATICAL BOLD ITALIC SMALL R → LATIN SMALL LETTER R # +1D4C7 ; 0072 ; MA # ( 𝓇 → r ) 
MATHEMATICAL SCRIPT SMALL R → LATIN SMALL LETTER R # +1D4FB ; 0072 ; MA # ( 𝓻 → r ) MATHEMATICAL BOLD SCRIPT SMALL R → LATIN SMALL LETTER R # +1D52F ; 0072 ; MA # ( 𝔯 → r ) MATHEMATICAL FRAKTUR SMALL R → LATIN SMALL LETTER R # +1D563 ; 0072 ; MA # ( 𝕣 → r ) MATHEMATICAL DOUBLE-STRUCK SMALL R → LATIN SMALL LETTER R # +1D597 ; 0072 ; MA # ( 𝖗 → r ) MATHEMATICAL BOLD FRAKTUR SMALL R → LATIN SMALL LETTER R # +1D5CB ; 0072 ; MA # ( 𝗋 → r ) MATHEMATICAL SANS-SERIF SMALL R → LATIN SMALL LETTER R # +1D5FF ; 0072 ; MA # ( 𝗿 → r ) MATHEMATICAL SANS-SERIF BOLD SMALL R → LATIN SMALL LETTER R # +1D633 ; 0072 ; MA # ( 𝘳 → r ) MATHEMATICAL SANS-SERIF ITALIC SMALL R → LATIN SMALL LETTER R # +1D667 ; 0072 ; MA # ( 𝙧 → r ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL R → LATIN SMALL LETTER R # +1D69B ; 0072 ; MA # ( 𝚛 → r ) MATHEMATICAL MONOSPACE SMALL R → LATIN SMALL LETTER R # +AB47 ; 0072 ; MA # ( ꭇ → r ) LATIN SMALL LETTER R WITHOUT HANDLE → LATIN SMALL LETTER R # +AB48 ; 0072 ; MA # ( ꭈ → r ) LATIN SMALL LETTER DOUBLE R → LATIN SMALL LETTER R # +1D26 ; 0072 ; MA # ( ᴦ → r ) GREEK LETTER SMALL CAPITAL GAMMA → LATIN SMALL LETTER R # →г→ +2C85 ; 0072 ; MA # ( ⲅ → r ) COPTIC SMALL LETTER GAMMA → LATIN SMALL LETTER R # →г→ +0433 ; 0072 ; MA # ( г → r ) CYRILLIC SMALL LETTER GHE → LATIN SMALL LETTER R # +AB81 ; 0072 ; MA # ( ꮁ → r ) CHEROKEE SMALL LETTER HU → LATIN SMALL LETTER R # →ᴦ→→г→ + +1D216 ; 0052 ; MA #* ( 𝈖 → R ) GREEK VOCAL NOTATION SYMBOL-23 → LATIN CAPITAL LETTER R # +211B ; 0052 ; MA # ( ℛ → R ) SCRIPT CAPITAL R → LATIN CAPITAL LETTER R # +211C ; 0052 ; MA # ( ℜ → R ) BLACK-LETTER CAPITAL R → LATIN CAPITAL LETTER R # +211D ; 0052 ; MA # ( ℝ → R ) DOUBLE-STRUCK CAPITAL R → LATIN CAPITAL LETTER R # +1CCE7 ; 0052 ; MA #* ( 𜳧 → R ) OUTLINED LATIN CAPITAL LETTER R → LATIN CAPITAL LETTER R # +1D411 ; 0052 ; MA # ( 𝐑 → R ) MATHEMATICAL BOLD CAPITAL R → LATIN CAPITAL LETTER R # +1D445 ; 0052 ; MA # ( 𝑅 → R ) MATHEMATICAL ITALIC CAPITAL R → LATIN CAPITAL LETTER R # +1D479 ; 0052 ; 
MA # ( 𝑹 → R ) MATHEMATICAL BOLD ITALIC CAPITAL R → LATIN CAPITAL LETTER R # +1D4E1 ; 0052 ; MA # ( 𝓡 → R ) MATHEMATICAL BOLD SCRIPT CAPITAL R → LATIN CAPITAL LETTER R # +1D57D ; 0052 ; MA # ( 𝕽 → R ) MATHEMATICAL BOLD FRAKTUR CAPITAL R → LATIN CAPITAL LETTER R # +1D5B1 ; 0052 ; MA # ( 𝖱 → R ) MATHEMATICAL SANS-SERIF CAPITAL R → LATIN CAPITAL LETTER R # +1D5E5 ; 0052 ; MA # ( 𝗥 → R ) MATHEMATICAL SANS-SERIF BOLD CAPITAL R → LATIN CAPITAL LETTER R # +1D619 ; 0052 ; MA # ( 𝘙 → R ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL R → LATIN CAPITAL LETTER R # +1D64D ; 0052 ; MA # ( 𝙍 → R ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL R → LATIN CAPITAL LETTER R # +1D681 ; 0052 ; MA # ( 𝚁 → R ) MATHEMATICAL MONOSPACE CAPITAL R → LATIN CAPITAL LETTER R # +01A6 ; 0052 ; MA # ( Ʀ → R ) LATIN LETTER YR → LATIN CAPITAL LETTER R # +13A1 ; 0052 ; MA # ( Ꭱ → R ) CHEROKEE LETTER E → LATIN CAPITAL LETTER R # +13D2 ; 0052 ; MA # ( Ꮢ → R ) CHEROKEE LETTER SV → LATIN CAPITAL LETTER R # +104B4 ; 0052 ; MA # ( 𐒴 → R ) OSAGE CAPITAL LETTER BRA → LATIN CAPITAL LETTER R # →Ʀ→ +1587 ; 0052 ; MA # ( ᖇ → R ) CANADIAN SYLLABICS TLHI → LATIN CAPITAL LETTER R # +A4E3 ; 0052 ; MA # ( ꓣ → R ) LISU LETTER ZHA → LATIN CAPITAL LETTER R # +16F35 ; 0052 ; MA # ( 𖼵 → R ) MIAO LETTER ZHA → LATIN CAPITAL LETTER R # + +027D ; 0072 0328 ; MA # ( ɽ → r̨ ) LATIN SMALL LETTER R WITH TAIL → LATIN SMALL LETTER R, COMBINING OGONEK # + +027C ; 0072 0329 ; MA # ( ɼ → r̩ ) LATIN SMALL LETTER R WITH LONG LEG → LATIN SMALL LETTER R, COMBINING VERTICAL LINE BELOW # + +024D ; 0072 0335 ; MA # ( ɍ → r̵ ) LATIN SMALL LETTER R WITH STROKE → LATIN SMALL LETTER R, COMBINING SHORT STROKE OVERLAY # +0493 ; 0072 0335 ; MA # ( ғ → r̵ ) CYRILLIC SMALL LETTER GHE WITH STROKE → LATIN SMALL LETTER R, COMBINING SHORT STROKE OVERLAY # →г̵→ + +1D72 ; 0072 0334 ; MA # ( ᵲ → r̴ ) LATIN SMALL LETTER R WITH MIDDLE TILDE → LATIN SMALL LETTER R, COMBINING TILDE OVERLAY # + +0491 ; 0072 0027 ; MA # ( ґ → r' ) CYRILLIC SMALL LETTER GHE WITH UPTURN → 
LATIN SMALL LETTER R, APOSTROPHE # →гˈ→ + +118E3 ; 0072 006E ; MA # ( 𑣣 → rn ) WARANG CITI DIGIT THREE → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +006D ; 0072 006E ; MA # ( m → rn ) LATIN SMALL LETTER M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # +217F ; 0072 006E ; MA # ( ⅿ → rn ) SMALL ROMAN NUMERAL ONE THOUSAND → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D426 ; 0072 006E ; MA # ( 𝐦 → rn ) MATHEMATICAL BOLD SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D45A ; 0072 006E ; MA # ( 𝑚 → rn ) MATHEMATICAL ITALIC SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D48E ; 0072 006E ; MA # ( 𝒎 → rn ) MATHEMATICAL BOLD ITALIC SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D4C2 ; 0072 006E ; MA # ( 𝓂 → rn ) MATHEMATICAL SCRIPT SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D4F6 ; 0072 006E ; MA # ( 𝓶 → rn ) MATHEMATICAL BOLD SCRIPT SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D52A ; 0072 006E ; MA # ( 𝔪 → rn ) MATHEMATICAL FRAKTUR SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D55E ; 0072 006E ; MA # ( 𝕞 → rn ) MATHEMATICAL DOUBLE-STRUCK SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D592 ; 0072 006E ; MA # ( 𝖒 → rn ) MATHEMATICAL BOLD FRAKTUR SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D5C6 ; 0072 006E ; MA # ( 𝗆 → rn ) MATHEMATICAL SANS-SERIF SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D5FA ; 0072 006E ; MA # ( 𝗺 → rn ) MATHEMATICAL SANS-SERIF BOLD SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D62E ; 0072 006E ; MA # ( 𝘮 → rn ) MATHEMATICAL SANS-SERIF ITALIC SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D662 ; 0072 006E ; MA # ( 𝙢 → rn ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +1D696 ; 0072 006E ; MA # ( 𝚖 → rn ) MATHEMATICAL MONOSPACE SMALL M → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ +11700 ; 0072 006E ; MA # ( 𑜀 → rn ) AHOM LETTER 
KA → LATIN SMALL LETTER R, LATIN SMALL LETTER N # →m→ + +20A5 ; 0072 006E 0338 ; MA #* ( ₥ → rn̸ ) MILL SIGN → LATIN SMALL LETTER R, LATIN SMALL LETTER N, COMBINING LONG SOLIDUS OVERLAY # →m̷→ + +0271 ; 0072 006E 0326 ; MA # ( ɱ → rn̦ ) LATIN SMALL LETTER M WITH HOOK → LATIN SMALL LETTER R, LATIN SMALL LETTER N, COMBINING COMMA BELOW # →m̡→ + +1D6F ; 0072 006E 0334 ; MA # ( ᵯ → rn̴ ) LATIN SMALL LETTER M WITH MIDDLE TILDE → LATIN SMALL LETTER R, LATIN SMALL LETTER N, COMBINING TILDE OVERLAY # →m̴→ + +20A8 ; 0052 0073 ; MA #* ( ₨ → Rs ) RUPEE SIGN → LATIN CAPITAL LETTER R, LATIN SMALL LETTER S # + +AB71 ; 0280 ; MA # ( ꭱ → ʀ ) CHEROKEE SMALL LETTER E → LATIN LETTER SMALL CAPITAL R # +ABA2 ; 0280 ; MA # ( ꮢ → ʀ ) CHEROKEE SMALL LETTER SV → LATIN LETTER SMALL CAPITAL R # + +044F ; 1D19 ; MA # ( я → ᴙ ) CYRILLIC SMALL LETTER YA → LATIN LETTER SMALL CAPITAL REVERSED R # + +1D73 ; 027E 0334 ; MA # ( ᵳ → ɾ̴ ) LATIN SMALL LETTER R WITH FISHHOOK AND MIDDLE TILDE → LATIN SMALL LETTER R WITH FISHHOOK, COMBINING TILDE OVERLAY # + +2129 ; 027F ; MA #* ( ℩ → ɿ ) TURNED GREEK SMALL LETTER IOTA → LATIN SMALL LETTER REVERSED R WITH FISHHOOK # + +FF53 ; 0073 ; MA # ( s → s ) FULLWIDTH LATIN SMALL LETTER S → LATIN SMALL LETTER S # →ѕ→ +1D42C ; 0073 ; MA # ( 𝐬 → s ) MATHEMATICAL BOLD SMALL S → LATIN SMALL LETTER S # +1D460 ; 0073 ; MA # ( 𝑠 → s ) MATHEMATICAL ITALIC SMALL S → LATIN SMALL LETTER S # +1D494 ; 0073 ; MA # ( 𝒔 → s ) MATHEMATICAL BOLD ITALIC SMALL S → LATIN SMALL LETTER S # +1D4C8 ; 0073 ; MA # ( 𝓈 → s ) MATHEMATICAL SCRIPT SMALL S → LATIN SMALL LETTER S # +1D4FC ; 0073 ; MA # ( 𝓼 → s ) MATHEMATICAL BOLD SCRIPT SMALL S → LATIN SMALL LETTER S # +1D530 ; 0073 ; MA # ( 𝔰 → s ) MATHEMATICAL FRAKTUR SMALL S → LATIN SMALL LETTER S # +1D564 ; 0073 ; MA # ( 𝕤 → s ) MATHEMATICAL DOUBLE-STRUCK SMALL S → LATIN SMALL LETTER S # +1D598 ; 0073 ; MA # ( 𝖘 → s ) MATHEMATICAL BOLD FRAKTUR SMALL S → LATIN SMALL LETTER S # +1D5CC ; 0073 ; MA # ( 𝗌 → s ) MATHEMATICAL SANS-SERIF SMALL S → 
LATIN SMALL LETTER S # +1D600 ; 0073 ; MA # ( 𝘀 → s ) MATHEMATICAL SANS-SERIF BOLD SMALL S → LATIN SMALL LETTER S # +1D634 ; 0073 ; MA # ( 𝘴 → s ) MATHEMATICAL SANS-SERIF ITALIC SMALL S → LATIN SMALL LETTER S # +1D668 ; 0073 ; MA # ( 𝙨 → s ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL S → LATIN SMALL LETTER S # +1D69C ; 0073 ; MA # ( 𝚜 → s ) MATHEMATICAL MONOSPACE SMALL S → LATIN SMALL LETTER S # +A731 ; 0073 ; MA # ( ꜱ → s ) LATIN LETTER SMALL CAPITAL S → LATIN SMALL LETTER S # +01BD ; 0073 ; MA # ( ƽ → s ) LATIN SMALL LETTER TONE FIVE → LATIN SMALL LETTER S # +0455 ; 0073 ; MA # ( ѕ → s ) CYRILLIC SMALL LETTER DZE → LATIN SMALL LETTER S # +0D1F ; 0073 ; MA # ( ട → s ) MALAYALAM LETTER TTA → LATIN SMALL LETTER S # +ABAA ; 0073 ; MA # ( ꮪ → s ) CHEROKEE SMALL LETTER DU → LATIN SMALL LETTER S # →ꜱ→ +118C1 ; 0073 ; MA # ( 𑣁 → s ) WARANG CITI SMALL LETTER A → LATIN SMALL LETTER S # +10448 ; 0073 ; MA # ( 𐑈 → s ) DESERET SMALL LETTER ZHEE → LATIN SMALL LETTER S # + +FF33 ; 0053 ; MA # ( S → S ) FULLWIDTH LATIN CAPITAL LETTER S → LATIN CAPITAL LETTER S # →Ѕ→ +1CCE8 ; 0053 ; MA #* ( 𜳨 → S ) OUTLINED LATIN CAPITAL LETTER S → LATIN CAPITAL LETTER S # +1D412 ; 0053 ; MA # ( 𝐒 → S ) MATHEMATICAL BOLD CAPITAL S → LATIN CAPITAL LETTER S # +1D446 ; 0053 ; MA # ( 𝑆 → S ) MATHEMATICAL ITALIC CAPITAL S → LATIN CAPITAL LETTER S # +1D47A ; 0053 ; MA # ( 𝑺 → S ) MATHEMATICAL BOLD ITALIC CAPITAL S → LATIN CAPITAL LETTER S # +1D4AE ; 0053 ; MA # ( 𝒮 → S ) MATHEMATICAL SCRIPT CAPITAL S → LATIN CAPITAL LETTER S # +1D4E2 ; 0053 ; MA # ( 𝓢 → S ) MATHEMATICAL BOLD SCRIPT CAPITAL S → LATIN CAPITAL LETTER S # +1D516 ; 0053 ; MA # ( 𝔖 → S ) MATHEMATICAL FRAKTUR CAPITAL S → LATIN CAPITAL LETTER S # +1D54A ; 0053 ; MA # ( 𝕊 → S ) MATHEMATICAL DOUBLE-STRUCK CAPITAL S → LATIN CAPITAL LETTER S # +1D57E ; 0053 ; MA # ( 𝕾 → S ) MATHEMATICAL BOLD FRAKTUR CAPITAL S → LATIN CAPITAL LETTER S # +1D5B2 ; 0053 ; MA # ( 𝖲 → S ) MATHEMATICAL SANS-SERIF CAPITAL S → LATIN CAPITAL LETTER S # +1D5E6 ; 0053 ; MA # ( 
𝗦 → S ) MATHEMATICAL SANS-SERIF BOLD CAPITAL S → LATIN CAPITAL LETTER S # +1D61A ; 0053 ; MA # ( 𝘚 → S ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL S → LATIN CAPITAL LETTER S # +1D64E ; 0053 ; MA # ( 𝙎 → S ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL S → LATIN CAPITAL LETTER S # +1D682 ; 0053 ; MA # ( 𝚂 → S ) MATHEMATICAL MONOSPACE CAPITAL S → LATIN CAPITAL LETTER S # +0405 ; 0053 ; MA # ( Ѕ → S ) CYRILLIC CAPITAL LETTER DZE → LATIN CAPITAL LETTER S # +054F ; 0053 ; MA # ( Տ → S ) ARMENIAN CAPITAL LETTER TIWN → LATIN CAPITAL LETTER S # +13D5 ; 0053 ; MA # ( Ꮥ → S ) CHEROKEE LETTER DE → LATIN CAPITAL LETTER S # +13DA ; 0053 ; MA # ( Ꮪ → S ) CHEROKEE LETTER DU → LATIN CAPITAL LETTER S # +A4E2 ; 0053 ; MA # ( ꓢ → S ) LISU LETTER SA → LATIN CAPITAL LETTER S # +16F3A ; 0053 ; MA # ( 𖼺 → S ) MIAO LETTER SA → LATIN CAPITAL LETTER S # +10296 ; 0053 ; MA # ( 𐊖 → S ) LYCIAN LETTER S → LATIN CAPITAL LETTER S # +10420 ; 0053 ; MA # ( 𐐠 → S ) DESERET CAPITAL LETTER ZHEE → LATIN CAPITAL LETTER S # + +0282 ; 0073 0328 ; MA # ( ʂ → s̨ ) LATIN SMALL LETTER S WITH HOOK → LATIN SMALL LETTER S, COMBINING OGONEK # + +1D74 ; 0073 0334 ; MA # ( ᵴ → s̴ ) LATIN SMALL LETTER S WITH MIDDLE TILDE → LATIN SMALL LETTER S, COMBINING TILDE OVERLAY # + +A7B5 ; 00DF ; MA # ( ꞵ → ß ) LATIN SMALL LETTER BETA → LATIN SMALL LETTER SHARP S # →β→ +1E9E ; 00DF ; MA # ( ẞ → ß ) LATIN CAPITAL LETTER SHARP S → LATIN SMALL LETTER SHARP S # +A7D6 ; 00DF ; MA # ( Ꟗ → ß ) LATIN CAPITAL LETTER MIDDLE SCOTS S → LATIN SMALL LETTER SHARP S # →β→ +03B2 ; 00DF ; MA # ( β → ß ) GREEK SMALL LETTER BETA → LATIN SMALL LETTER SHARP S # +03D0 ; 00DF ; MA # ( ϐ → ß ) GREEK BETA SYMBOL → LATIN SMALL LETTER SHARP S # →β→ +1D6C3 ; 00DF ; MA # ( 𝛃 → ß ) MATHEMATICAL BOLD SMALL BETA → LATIN SMALL LETTER SHARP S # →β→ +1D6FD ; 00DF ; MA # ( 𝛽 → ß ) MATHEMATICAL ITALIC SMALL BETA → LATIN SMALL LETTER SHARP S # →β→ +1D737 ; 00DF ; MA # ( 𝜷 → ß ) MATHEMATICAL BOLD ITALIC SMALL BETA → LATIN SMALL LETTER SHARP S # →β→ +1D771 ; 00DF ; MA # 
( 𝝱 → ß ) MATHEMATICAL SANS-SERIF BOLD SMALL BETA → LATIN SMALL LETTER SHARP S # →β→ +1D7AB ; 00DF ; MA # ( 𝞫 → ß ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL BETA → LATIN SMALL LETTER SHARP S # →β→ +13F0 ; 00DF ; MA # ( Ᏸ → ß ) CHEROKEE LETTER YE → LATIN SMALL LETTER SHARP S # →β→ + +1F75C ; 0073 0073 0073 ; MA #* ( 🝜 → sss ) ALCHEMICAL SYMBOL FOR STRATUM SUPER STRATUM → LATIN SMALL LETTER S, LATIN SMALL LETTER S, LATIN SMALL LETTER S # + +FB06 ; 0073 0074 ; MA # ( st → st ) LATIN SMALL LIGATURE ST → LATIN SMALL LETTER S, LATIN SMALL LETTER T # + +222B ; 0283 ; MA #* ( ∫ → ʃ ) INTEGRAL → LATIN SMALL LETTER ESH # +AB4D ; 0283 ; MA # ( ꭍ → ʃ ) LATIN SMALL LETTER BASELINE ESH → LATIN SMALL LETTER ESH # + +2211 ; 01A9 ; MA #* ( ∑ → Ʃ ) N-ARY SUMMATION → LATIN CAPITAL LETTER ESH # +2140 ; 01A9 ; MA #* ( ⅀ → Ʃ ) DOUBLE-STRUCK N-ARY SUMMATION → LATIN CAPITAL LETTER ESH # →∑→ +03A3 ; 01A9 ; MA # ( Σ → Ʃ ) GREEK CAPITAL LETTER SIGMA → LATIN CAPITAL LETTER ESH # +1D6BA ; 01A9 ; MA # ( 𝚺 → Ʃ ) MATHEMATICAL BOLD CAPITAL SIGMA → LATIN CAPITAL LETTER ESH # →Σ→ +1D6F4 ; 01A9 ; MA # ( 𝛴 → Ʃ ) MATHEMATICAL ITALIC CAPITAL SIGMA → LATIN CAPITAL LETTER ESH # →Σ→ +1D72E ; 01A9 ; MA # ( 𝜮 → Ʃ ) MATHEMATICAL BOLD ITALIC CAPITAL SIGMA → LATIN CAPITAL LETTER ESH # →Σ→ +1D768 ; 01A9 ; MA # ( 𝝨 → Ʃ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL SIGMA → LATIN CAPITAL LETTER ESH # →Σ→ +1D7A2 ; 01A9 ; MA # ( 𝞢 → Ʃ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL SIGMA → LATIN CAPITAL LETTER ESH # →Σ→ +2D49 ; 01A9 ; MA # ( ⵉ → Ʃ ) TIFINAGH LETTER YI → LATIN CAPITAL LETTER ESH # + +222C ; 0283 0283 ; MA #* ( ∬ → ʃʃ ) DOUBLE INTEGRAL → LATIN SMALL LETTER ESH, LATIN SMALL LETTER ESH # →∫∫→ + +222D ; 0283 0283 0283 ; MA #* ( ∭ → ʃʃʃ ) TRIPLE INTEGRAL → LATIN SMALL LETTER ESH, LATIN SMALL LETTER ESH, LATIN SMALL LETTER ESH # →∫∫∫→ + +2A0C ; 0283 0283 0283 0283 ; MA #* ( ⨌ → ʃʃʃʃ ) QUADRUPLE INTEGRAL OPERATOR → LATIN SMALL LETTER ESH, LATIN SMALL LETTER ESH, LATIN SMALL LETTER ESH, LATIN SMALL LETTER ESH # 
→∫∫∫∫→ + +1D42D ; 0074 ; MA # ( 𝐭 → t ) MATHEMATICAL BOLD SMALL T → LATIN SMALL LETTER T # +1D461 ; 0074 ; MA # ( 𝑡 → t ) MATHEMATICAL ITALIC SMALL T → LATIN SMALL LETTER T # +1D495 ; 0074 ; MA # ( 𝒕 → t ) MATHEMATICAL BOLD ITALIC SMALL T → LATIN SMALL LETTER T # +1D4C9 ; 0074 ; MA # ( 𝓉 → t ) MATHEMATICAL SCRIPT SMALL T → LATIN SMALL LETTER T # +1D4FD ; 0074 ; MA # ( 𝓽 → t ) MATHEMATICAL BOLD SCRIPT SMALL T → LATIN SMALL LETTER T # +1D531 ; 0074 ; MA # ( 𝔱 → t ) MATHEMATICAL FRAKTUR SMALL T → LATIN SMALL LETTER T # +1D565 ; 0074 ; MA # ( 𝕥 → t ) MATHEMATICAL DOUBLE-STRUCK SMALL T → LATIN SMALL LETTER T # +1D599 ; 0074 ; MA # ( 𝖙 → t ) MATHEMATICAL BOLD FRAKTUR SMALL T → LATIN SMALL LETTER T # +1D5CD ; 0074 ; MA # ( 𝗍 → t ) MATHEMATICAL SANS-SERIF SMALL T → LATIN SMALL LETTER T # +1D601 ; 0074 ; MA # ( 𝘁 → t ) MATHEMATICAL SANS-SERIF BOLD SMALL T → LATIN SMALL LETTER T # +1D635 ; 0074 ; MA # ( 𝘵 → t ) MATHEMATICAL SANS-SERIF ITALIC SMALL T → LATIN SMALL LETTER T # +1D669 ; 0074 ; MA # ( 𝙩 → t ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL T → LATIN SMALL LETTER T # +1D69D ; 0074 ; MA # ( 𝚝 → t ) MATHEMATICAL MONOSPACE SMALL T → LATIN SMALL LETTER T # + +22A4 ; 0054 ; MA #* ( ⊤ → T ) DOWN TACK → LATIN CAPITAL LETTER T # +27D9 ; 0054 ; MA #* ( ⟙ → T ) LARGE DOWN TACK → LATIN CAPITAL LETTER T # +1F768 ; 0054 ; MA #* ( 🝨 → T ) ALCHEMICAL SYMBOL FOR CRUCIBLE-4 → LATIN CAPITAL LETTER T # +FF34 ; 0054 ; MA # ( T → T ) FULLWIDTH LATIN CAPITAL LETTER T → LATIN CAPITAL LETTER T # →Т→ +1CCE9 ; 0054 ; MA #* ( 𜳩 → T ) OUTLINED LATIN CAPITAL LETTER T → LATIN CAPITAL LETTER T # +1D413 ; 0054 ; MA # ( 𝐓 → T ) MATHEMATICAL BOLD CAPITAL T → LATIN CAPITAL LETTER T # +1D447 ; 0054 ; MA # ( 𝑇 → T ) MATHEMATICAL ITALIC CAPITAL T → LATIN CAPITAL LETTER T # +1D47B ; 0054 ; MA # ( 𝑻 → T ) MATHEMATICAL BOLD ITALIC CAPITAL T → LATIN CAPITAL LETTER T # +1D4AF ; 0054 ; MA # ( 𝒯 → T ) MATHEMATICAL SCRIPT CAPITAL T → LATIN CAPITAL LETTER T # +1D4E3 ; 0054 ; MA # ( 𝓣 → T ) MATHEMATICAL BOLD SCRIPT 
CAPITAL T → LATIN CAPITAL LETTER T # +1D517 ; 0054 ; MA # ( 𝔗 → T ) MATHEMATICAL FRAKTUR CAPITAL T → LATIN CAPITAL LETTER T # +1D54B ; 0054 ; MA # ( 𝕋 → T ) MATHEMATICAL DOUBLE-STRUCK CAPITAL T → LATIN CAPITAL LETTER T # +1D57F ; 0054 ; MA # ( 𝕿 → T ) MATHEMATICAL BOLD FRAKTUR CAPITAL T → LATIN CAPITAL LETTER T # +1D5B3 ; 0054 ; MA # ( 𝖳 → T ) MATHEMATICAL SANS-SERIF CAPITAL T → LATIN CAPITAL LETTER T # +1D5E7 ; 0054 ; MA # ( 𝗧 → T ) MATHEMATICAL SANS-SERIF BOLD CAPITAL T → LATIN CAPITAL LETTER T # +1D61B ; 0054 ; MA # ( 𝘛 → T ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL T → LATIN CAPITAL LETTER T # +1D64F ; 0054 ; MA # ( 𝙏 → T ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL T → LATIN CAPITAL LETTER T # +1D683 ; 0054 ; MA # ( 𝚃 → T ) MATHEMATICAL MONOSPACE CAPITAL T → LATIN CAPITAL LETTER T # +03A4 ; 0054 ; MA # ( Τ → T ) GREEK CAPITAL LETTER TAU → LATIN CAPITAL LETTER T # +1D6BB ; 0054 ; MA # ( 𝚻 → T ) MATHEMATICAL BOLD CAPITAL TAU → LATIN CAPITAL LETTER T # →Τ→ +1D6F5 ; 0054 ; MA # ( 𝛵 → T ) MATHEMATICAL ITALIC CAPITAL TAU → LATIN CAPITAL LETTER T # →Τ→ +1D72F ; 0054 ; MA # ( 𝜯 → T ) MATHEMATICAL BOLD ITALIC CAPITAL TAU → LATIN CAPITAL LETTER T # →Τ→ +1D769 ; 0054 ; MA # ( 𝝩 → T ) MATHEMATICAL SANS-SERIF BOLD CAPITAL TAU → LATIN CAPITAL LETTER T # →Τ→ +1D7A3 ; 0054 ; MA # ( 𝞣 → T ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL TAU → LATIN CAPITAL LETTER T # →Τ→ +2CA6 ; 0054 ; MA # ( Ⲧ → T ) COPTIC CAPITAL LETTER TAU → LATIN CAPITAL LETTER T # +0422 ; 0054 ; MA # ( Т → T ) CYRILLIC CAPITAL LETTER TE → LATIN CAPITAL LETTER T # +13A2 ; 0054 ; MA # ( Ꭲ → T ) CHEROKEE LETTER I → LATIN CAPITAL LETTER T # +A4D4 ; 0054 ; MA # ( ꓔ → T ) LISU LETTER TA → LATIN CAPITAL LETTER T # +16F0A ; 0054 ; MA # ( 𖼊 → T ) MIAO LETTER TA → LATIN CAPITAL LETTER T # +118BC ; 0054 ; MA # ( 𑢼 → T ) WARANG CITI CAPITAL LETTER HAR → LATIN CAPITAL LETTER T # +10297 ; 0054 ; MA # ( 𐊗 → T ) LYCIAN LETTER T → LATIN CAPITAL LETTER T # +102B1 ; 0054 ; MA # ( 𐊱 → T ) CARIAN LETTER C-18 → LATIN CAPITAL 
LETTER T # +10315 ; 0054 ; MA # ( 𐌕 → T ) OLD ITALIC LETTER TE → LATIN CAPITAL LETTER T # + +01AD ; 0074 0314 ; MA # ( ƭ → t̔ ) LATIN SMALL LETTER T WITH HOOK → LATIN SMALL LETTER T, COMBINING REVERSED COMMA ABOVE # + +2361 ; 0054 0308 ; MA #* ( ⍡ → T̈ ) APL FUNCTIONAL SYMBOL UP TACK DIAERESIS → LATIN CAPITAL LETTER T, COMBINING DIAERESIS # →⊤̈→ + +023E ; 0054 0338 ; MA # ( Ⱦ → T̸ ) LATIN CAPITAL LETTER T WITH DIAGONAL STROKE → LATIN CAPITAL LETTER T, COMBINING LONG SOLIDUS OVERLAY # + +021A ; 0162 ; MA # ( Ț → Ţ ) LATIN CAPITAL LETTER T WITH COMMA BELOW → LATIN CAPITAL LETTER T WITH CEDILLA # + +01AE ; 0054 0328 ; MA # ( Ʈ → T̨ ) LATIN CAPITAL LETTER T WITH RETROFLEX HOOK → LATIN CAPITAL LETTER T, COMBINING OGONEK # + +04AC ; 0054 0329 ; MA # ( Ҭ → T̩ ) CYRILLIC CAPITAL LETTER TE WITH DESCENDER → LATIN CAPITAL LETTER T, COMBINING VERTICAL LINE BELOW # →Т̩→ + +20AE ; 0054 20EB ; MA #* ( ₮ → T⃫ ) TUGRIK SIGN → LATIN CAPITAL LETTER T, COMBINING LONG DOUBLE SOLIDUS OVERLAY # →Т⃫→ + +0167 ; 0074 0335 ; MA # ( ŧ → t̵ ) LATIN SMALL LETTER T WITH STROKE → LATIN SMALL LETTER T, COMBINING SHORT STROKE OVERLAY # + +0166 ; 0054 0335 ; MA # ( Ŧ → T̵ ) LATIN CAPITAL LETTER T WITH STROKE → LATIN CAPITAL LETTER T, COMBINING SHORT STROKE OVERLAY # + +1D75 ; 0074 0334 ; MA # ( ᵵ → t̴ ) LATIN SMALL LETTER T WITH MIDDLE TILDE → LATIN SMALL LETTER T, COMBINING TILDE OVERLAY # + +10A0 ; A786 ; MA # ( Ⴀ → Ꞇ ) GEORGIAN CAPITAL LETTER AN → LATIN CAPITAL LETTER INSULAR T # + +A728 ; 0054 0033 ; MA # ( Ꜩ → T3 ) LATIN CAPITAL LETTER TZ → LATIN CAPITAL LETTER T, DIGIT THREE # →TƷ→ + +02A8 ; 0074 0255 ; MA # ( ʨ → tɕ ) LATIN SMALL LETTER TC DIGRAPH WITH CURL → LATIN SMALL LETTER T, LATIN SMALL LETTER C WITH CURL # + +2121 ; 0054 0045 004C ; MA #* ( ℡ → TEL ) TELEPHONE SIGN → LATIN CAPITAL LETTER T, LATIN CAPITAL LETTER E, LATIN CAPITAL LETTER L # + +A777 ; 0074 0066 ; MA # ( ꝷ → tf ) LATIN SMALL LETTER TUM → LATIN SMALL LETTER T, LATIN SMALL LETTER F # + +02A6 ; 0074 0073 ; MA # ( ʦ → ts ) 
LATIN SMALL LETTER TS DIGRAPH → LATIN SMALL LETTER T, LATIN SMALL LETTER S # + +02A7 ; 0074 0283 ; MA # ( ʧ → tʃ ) LATIN SMALL LETTER TESH DIGRAPH → LATIN SMALL LETTER T, LATIN SMALL LETTER ESH # + +A729 ; 0074 021D ; MA # ( ꜩ → tȝ ) LATIN SMALL LETTER TZ → LATIN SMALL LETTER T, LATIN SMALL LETTER YOGH # + +03C4 ; 1D1B ; MA # ( τ → ᴛ ) GREEK SMALL LETTER TAU → LATIN LETTER SMALL CAPITAL T # +1D6D5 ; 1D1B ; MA # ( 𝛕 → ᴛ ) MATHEMATICAL BOLD SMALL TAU → LATIN LETTER SMALL CAPITAL T # +1D70F ; 1D1B ; MA # ( 𝜏 → ᴛ ) MATHEMATICAL ITALIC SMALL TAU → LATIN LETTER SMALL CAPITAL T # +1D749 ; 1D1B ; MA # ( 𝝉 → ᴛ ) MATHEMATICAL BOLD ITALIC SMALL TAU → LATIN LETTER SMALL CAPITAL T # +1D783 ; 1D1B ; MA # ( 𝞃 → ᴛ ) MATHEMATICAL SANS-SERIF BOLD SMALL TAU → LATIN LETTER SMALL CAPITAL T # +1D7BD ; 1D1B ; MA # ( 𝞽 → ᴛ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL TAU → LATIN LETTER SMALL CAPITAL T # +2CA7 ; 1D1B ; MA # ( ⲧ → ᴛ ) COPTIC SMALL LETTER TAU → LATIN LETTER SMALL CAPITAL T # →т→ +0442 ; 1D1B ; MA # ( т → ᴛ ) CYRILLIC SMALL LETTER TE → LATIN LETTER SMALL CAPITAL T # +AB72 ; 1D1B ; MA # ( ꭲ → ᴛ ) CHEROKEE SMALL LETTER I → LATIN LETTER SMALL CAPITAL T # + +04AD ; 1D1B 0329 ; MA # ( ҭ → ᴛ̩ ) CYRILLIC SMALL LETTER TE WITH DESCENDER → LATIN LETTER SMALL CAPITAL T, COMBINING VERTICAL LINE BELOW # →т̩→ + +0163 ; 01AB ; MA # ( ţ → ƫ ) LATIN SMALL LETTER T WITH CEDILLA → LATIN SMALL LETTER T WITH PALATAL HOOK # +021B ; 01AB ; MA # ( ț → ƫ ) LATIN SMALL LETTER T WITH COMMA BELOW → LATIN SMALL LETTER T WITH PALATAL HOOK # →ţ→ +13BF ; 01AB ; MA # ( Ꮏ → ƫ ) CHEROKEE LETTER HNA → LATIN SMALL LETTER T WITH PALATAL HOOK # + +1D42E ; 0075 ; MA # ( 𝐮 → u ) MATHEMATICAL BOLD SMALL U → LATIN SMALL LETTER U # +1D462 ; 0075 ; MA # ( 𝑢 → u ) MATHEMATICAL ITALIC SMALL U → LATIN SMALL LETTER U # +1D496 ; 0075 ; MA # ( 𝒖 → u ) MATHEMATICAL BOLD ITALIC SMALL U → LATIN SMALL LETTER U # +1D4CA ; 0075 ; MA # ( 𝓊 → u ) MATHEMATICAL SCRIPT SMALL U → LATIN SMALL LETTER U # +1D4FE ; 0075 ; MA # ( 𝓾 → u ) 
MATHEMATICAL BOLD SCRIPT SMALL U → LATIN SMALL LETTER U # +1D532 ; 0075 ; MA # ( 𝔲 → u ) MATHEMATICAL FRAKTUR SMALL U → LATIN SMALL LETTER U # +1D566 ; 0075 ; MA # ( 𝕦 → u ) MATHEMATICAL DOUBLE-STRUCK SMALL U → LATIN SMALL LETTER U # +1D59A ; 0075 ; MA # ( 𝖚 → u ) MATHEMATICAL BOLD FRAKTUR SMALL U → LATIN SMALL LETTER U # +1D5CE ; 0075 ; MA # ( 𝗎 → u ) MATHEMATICAL SANS-SERIF SMALL U → LATIN SMALL LETTER U # +1D602 ; 0075 ; MA # ( 𝘂 → u ) MATHEMATICAL SANS-SERIF BOLD SMALL U → LATIN SMALL LETTER U # +1D636 ; 0075 ; MA # ( 𝘶 → u ) MATHEMATICAL SANS-SERIF ITALIC SMALL U → LATIN SMALL LETTER U # +1D66A ; 0075 ; MA # ( 𝙪 → u ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL U → LATIN SMALL LETTER U # +1D69E ; 0075 ; MA # ( 𝚞 → u ) MATHEMATICAL MONOSPACE SMALL U → LATIN SMALL LETTER U # +A79F ; 0075 ; MA # ( ꞟ → u ) LATIN SMALL LETTER VOLAPUK UE → LATIN SMALL LETTER U # +1D1C ; 0075 ; MA # ( ᴜ → u ) LATIN LETTER SMALL CAPITAL U → LATIN SMALL LETTER U # +AB4E ; 0075 ; MA # ( ꭎ → u ) LATIN SMALL LETTER U WITH SHORT RIGHT LEG → LATIN SMALL LETTER U # +AB52 ; 0075 ; MA # ( ꭒ → u ) LATIN SMALL LETTER U WITH LEFT HOOK → LATIN SMALL LETTER U # +028B ; 0075 ; MA # ( ʋ → u ) LATIN SMALL LETTER V WITH HOOK → LATIN SMALL LETTER U # +03C5 ; 0075 ; MA # ( υ → u ) GREEK SMALL LETTER UPSILON → LATIN SMALL LETTER U # →ʋ→ +1D6D6 ; 0075 ; MA # ( 𝛖 → u ) MATHEMATICAL BOLD SMALL UPSILON → LATIN SMALL LETTER U # →υ→→ʋ→ +1D710 ; 0075 ; MA # ( 𝜐 → u ) MATHEMATICAL ITALIC SMALL UPSILON → LATIN SMALL LETTER U # →υ→→ʋ→ +1D74A ; 0075 ; MA # ( 𝝊 → u ) MATHEMATICAL BOLD ITALIC SMALL UPSILON → LATIN SMALL LETTER U # →υ→→ʋ→ +1D784 ; 0075 ; MA # ( 𝞄 → u ) MATHEMATICAL SANS-SERIF BOLD SMALL UPSILON → LATIN SMALL LETTER U # →υ→→ʋ→ +1D7BE ; 0075 ; MA # ( 𝞾 → u ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL UPSILON → LATIN SMALL LETTER U # →υ→→ʋ→ +057D ; 0075 ; MA # ( ս → u ) ARMENIAN SMALL LETTER SEH → LATIN SMALL LETTER U # +104F6 ; 0075 ; MA # ( 𐓶 → u ) OSAGE SMALL LETTER U → LATIN SMALL LETTER U # →ᴜ→ +118D8 ; 
0075 ; MA # ( 𑣘 → u ) WARANG CITI SMALL LETTER PU → LATIN SMALL LETTER U # →υ→→ʋ→ + +222A ; 0055 ; MA #* ( ∪ → U ) UNION → LATIN CAPITAL LETTER U # →ᑌ→ +22C3 ; 0055 ; MA #* ( ⋃ → U ) N-ARY UNION → LATIN CAPITAL LETTER U # →∪→→ᑌ→ +1CCEA ; 0055 ; MA #* ( 𜳪 → U ) OUTLINED LATIN CAPITAL LETTER U → LATIN CAPITAL LETTER U # +1D414 ; 0055 ; MA # ( 𝐔 → U ) MATHEMATICAL BOLD CAPITAL U → LATIN CAPITAL LETTER U # +1D448 ; 0055 ; MA # ( 𝑈 → U ) MATHEMATICAL ITALIC CAPITAL U → LATIN CAPITAL LETTER U # +1D47C ; 0055 ; MA # ( 𝑼 → U ) MATHEMATICAL BOLD ITALIC CAPITAL U → LATIN CAPITAL LETTER U # +1D4B0 ; 0055 ; MA # ( 𝒰 → U ) MATHEMATICAL SCRIPT CAPITAL U → LATIN CAPITAL LETTER U # +1D4E4 ; 0055 ; MA # ( 𝓤 → U ) MATHEMATICAL BOLD SCRIPT CAPITAL U → LATIN CAPITAL LETTER U # +1D518 ; 0055 ; MA # ( 𝔘 → U ) MATHEMATICAL FRAKTUR CAPITAL U → LATIN CAPITAL LETTER U # +1D54C ; 0055 ; MA # ( 𝕌 → U ) MATHEMATICAL DOUBLE-STRUCK CAPITAL U → LATIN CAPITAL LETTER U # +1D580 ; 0055 ; MA # ( 𝖀 → U ) MATHEMATICAL BOLD FRAKTUR CAPITAL U → LATIN CAPITAL LETTER U # +1D5B4 ; 0055 ; MA # ( 𝖴 → U ) MATHEMATICAL SANS-SERIF CAPITAL U → LATIN CAPITAL LETTER U # +1D5E8 ; 0055 ; MA # ( 𝗨 → U ) MATHEMATICAL SANS-SERIF BOLD CAPITAL U → LATIN CAPITAL LETTER U # +1D61C ; 0055 ; MA # ( 𝘜 → U ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL U → LATIN CAPITAL LETTER U # +1D650 ; 0055 ; MA # ( 𝙐 → U ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL U → LATIN CAPITAL LETTER U # +1D684 ; 0055 ; MA # ( 𝚄 → U ) MATHEMATICAL MONOSPACE CAPITAL U → LATIN CAPITAL LETTER U # +054D ; 0055 ; MA # ( Ս → U ) ARMENIAN CAPITAL LETTER SEH → LATIN CAPITAL LETTER U # +1200 ; 0055 ; MA # ( ሀ → U ) ETHIOPIC SYLLABLE HA → LATIN CAPITAL LETTER U # →Ս→ +104CE ; 0055 ; MA # ( 𐓎 → U ) OSAGE CAPITAL LETTER U → LATIN CAPITAL LETTER U # +144C ; 0055 ; MA # ( ᑌ → U ) CANADIAN SYLLABICS TE → LATIN CAPITAL LETTER U # +A4F4 ; 0055 ; MA # ( ꓴ → U ) LISU LETTER U → LATIN CAPITAL LETTER U # +16F42 ; 0055 ; MA # ( 𖽂 → U ) MIAO LETTER WA → LATIN CAPITAL LETTER U # 
+118B8 ; 0055 ; MA # ( 𑢸 → U ) WARANG CITI CAPITAL LETTER PU → LATIN CAPITAL LETTER U # + +01D4 ; 016D ; MA # ( ǔ → ŭ ) LATIN SMALL LETTER U WITH CARON → LATIN SMALL LETTER U WITH BREVE # + +01D3 ; 016C ; MA # ( Ǔ → Ŭ ) LATIN CAPITAL LETTER U WITH CARON → LATIN CAPITAL LETTER U WITH BREVE # + +045F ; 0075 0329 ; MA # ( џ → u̩ ) CYRILLIC SMALL LETTER DZHE → LATIN SMALL LETTER U, COMBINING VERTICAL LINE BELOW # + +1D7E ; 0075 0335 ; MA # ( ᵾ → u̵ ) LATIN SMALL CAPITAL LETTER U WITH STROKE → LATIN SMALL LETTER U, COMBINING SHORT STROKE OVERLAY # →ᴜ̵→ +AB9C ; 0075 0335 ; MA # ( ꮜ → u̵ ) CHEROKEE SMALL LETTER SA → LATIN SMALL LETTER U, COMBINING SHORT STROKE OVERLAY # →ᴜ̵→ + +0244 ; 0055 0335 ; MA # ( Ʉ → U̵ ) LATIN CAPITAL LETTER U BAR → LATIN CAPITAL LETTER U, COMBINING SHORT STROKE OVERLAY # →U̶→ +13CC ; 0055 0335 ; MA # ( Ꮜ → U̵ ) CHEROKEE LETTER SA → LATIN CAPITAL LETTER U, COMBINING SHORT STROKE OVERLAY # →Ʉ→→U̶→ + +1458 ; 0055 00B7 ; MA # ( ᑘ → U· ) CANADIAN SYLLABICS WEST-CREE TWE → LATIN CAPITAL LETTER U, MIDDLE DOT # →ᑌᐧ→→ᑌ·→ + +1467 ; 0055 0027 ; MA # ( ᑧ → U' ) CANADIAN SYLLABICS TTE → LATIN CAPITAL LETTER U, APOSTROPHE # →ᑌᑊ→→ᑌ'→ + +1D6B ; 0075 0065 ; MA # ( ᵫ → ue ) LATIN SMALL LETTER UE → LATIN SMALL LETTER U, LATIN SMALL LETTER E # + +AB63 ; 0075 006F ; MA # ( ꭣ → uo ) LATIN SMALL LETTER UO → LATIN SMALL LETTER U, LATIN SMALL LETTER O # + +1E43 ; AB51 ; MA # ( ṃ → ꭑ ) LATIN SMALL LETTER M WITH DOT BELOW → LATIN SMALL LETTER TURNED UI # + +057A ; 0270 ; MA # ( պ → ɰ ) ARMENIAN SMALL LETTER PEH → LATIN SMALL LETTER TURNED M WITH LONG LEG # +1223 ; 0270 ; MA # ( ሣ → ɰ ) ETHIOPIC SYLLABLE SZAA → LATIN SMALL LETTER TURNED M WITH LONG LEG # →պ→ + +2127 ; 01B1 ; MA #* ( ℧ → Ʊ ) INVERTED OHM SIGN → LATIN CAPITAL LETTER UPSILON # +162E ; 01B1 ; MA # ( ᘮ → Ʊ ) CANADIAN SYLLABICS CARRIER LHU → LATIN CAPITAL LETTER UPSILON # →℧→ +1634 ; 01B1 ; MA # ( ᘴ → Ʊ ) CANADIAN SYLLABICS CARRIER TLHU → LATIN CAPITAL LETTER UPSILON # →ᘮ→→℧→ + +1D7F ; 028A 0335 ; MA # ( ᵿ → ʊ̵ ) 
LATIN SMALL LETTER UPSILON WITH STROKE → LATIN SMALL LETTER UPSILON, COMBINING SHORT STROKE OVERLAY # + +2228 ; 0076 ; MA #* ( ∨ → v ) LOGICAL OR → LATIN SMALL LETTER V # +22C1 ; 0076 ; MA #* ( ⋁ → v ) N-ARY LOGICAL OR → LATIN SMALL LETTER V # →∨→ +FF56 ; 0076 ; MA # ( v → v ) FULLWIDTH LATIN SMALL LETTER V → LATIN SMALL LETTER V # →ν→ +2174 ; 0076 ; MA # ( ⅴ → v ) SMALL ROMAN NUMERAL FIVE → LATIN SMALL LETTER V # +1D42F ; 0076 ; MA # ( 𝐯 → v ) MATHEMATICAL BOLD SMALL V → LATIN SMALL LETTER V # +1D463 ; 0076 ; MA # ( 𝑣 → v ) MATHEMATICAL ITALIC SMALL V → LATIN SMALL LETTER V # +1D497 ; 0076 ; MA # ( 𝒗 → v ) MATHEMATICAL BOLD ITALIC SMALL V → LATIN SMALL LETTER V # +1D4CB ; 0076 ; MA # ( 𝓋 → v ) MATHEMATICAL SCRIPT SMALL V → LATIN SMALL LETTER V # +1D4FF ; 0076 ; MA # ( 𝓿 → v ) MATHEMATICAL BOLD SCRIPT SMALL V → LATIN SMALL LETTER V # +1D533 ; 0076 ; MA # ( 𝔳 → v ) MATHEMATICAL FRAKTUR SMALL V → LATIN SMALL LETTER V # +1D567 ; 0076 ; MA # ( 𝕧 → v ) MATHEMATICAL DOUBLE-STRUCK SMALL V → LATIN SMALL LETTER V # +1D59B ; 0076 ; MA # ( 𝖛 → v ) MATHEMATICAL BOLD FRAKTUR SMALL V → LATIN SMALL LETTER V # +1D5CF ; 0076 ; MA # ( 𝗏 → v ) MATHEMATICAL SANS-SERIF SMALL V → LATIN SMALL LETTER V # +1D603 ; 0076 ; MA # ( 𝘃 → v ) MATHEMATICAL SANS-SERIF BOLD SMALL V → LATIN SMALL LETTER V # +1D637 ; 0076 ; MA # ( 𝘷 → v ) MATHEMATICAL SANS-SERIF ITALIC SMALL V → LATIN SMALL LETTER V # +1D66B ; 0076 ; MA # ( 𝙫 → v ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL V → LATIN SMALL LETTER V # +1D69F ; 0076 ; MA # ( 𝚟 → v ) MATHEMATICAL MONOSPACE SMALL V → LATIN SMALL LETTER V # +1D20 ; 0076 ; MA # ( ᴠ → v ) LATIN LETTER SMALL CAPITAL V → LATIN SMALL LETTER V # +03BD ; 0076 ; MA # ( ν → v ) GREEK SMALL LETTER NU → LATIN SMALL LETTER V # +1D6CE ; 0076 ; MA # ( 𝛎 → v ) MATHEMATICAL BOLD SMALL NU → LATIN SMALL LETTER V # →ν→ +1D708 ; 0076 ; MA # ( 𝜈 → v ) MATHEMATICAL ITALIC SMALL NU → LATIN SMALL LETTER V # →ν→ +1D742 ; 0076 ; MA # ( 𝝂 → v ) MATHEMATICAL BOLD ITALIC SMALL NU → LATIN SMALL LETTER V 
# →ν→ +1D77C ; 0076 ; MA # ( 𝝼 → v ) MATHEMATICAL SANS-SERIF BOLD SMALL NU → LATIN SMALL LETTER V # →ν→ +1D7B6 ; 0076 ; MA # ( 𝞶 → v ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL NU → LATIN SMALL LETTER V # →ν→ +0475 ; 0076 ; MA # ( ѵ → v ) CYRILLIC SMALL LETTER IZHITSA → LATIN SMALL LETTER V # +05D8 ; 0076 ; MA # ( ‎ט‎ → v ) HEBREW LETTER TET → LATIN SMALL LETTER V # +11706 ; 0076 ; MA # ( 𑜆 → v ) AHOM LETTER PA → LATIN SMALL LETTER V # +ABA9 ; 0076 ; MA # ( ꮩ → v ) CHEROKEE SMALL LETTER DO → LATIN SMALL LETTER V # →ᴠ→ +118C0 ; 0076 ; MA # ( 𑣀 → v ) WARANG CITI SMALL LETTER NGAA → LATIN SMALL LETTER V # + +1D20D ; 0056 ; MA #* ( 𝈍 → V ) GREEK VOCAL NOTATION SYMBOL-14 → LATIN CAPITAL LETTER V # +0667 ; 0056 ; MA # ( ‎٧‎ → V ) ARABIC-INDIC DIGIT SEVEN → LATIN CAPITAL LETTER V # +06F7 ; 0056 ; MA # ( ۷ → V ) EXTENDED ARABIC-INDIC DIGIT SEVEN → LATIN CAPITAL LETTER V # →‎٧‎→ +2164 ; 0056 ; MA # ( Ⅴ → V ) ROMAN NUMERAL FIVE → LATIN CAPITAL LETTER V # +1CCEB ; 0056 ; MA #* ( 𜳫 → V ) OUTLINED LATIN CAPITAL LETTER V → LATIN CAPITAL LETTER V # +1D415 ; 0056 ; MA # ( 𝐕 → V ) MATHEMATICAL BOLD CAPITAL V → LATIN CAPITAL LETTER V # +1D449 ; 0056 ; MA # ( 𝑉 → V ) MATHEMATICAL ITALIC CAPITAL V → LATIN CAPITAL LETTER V # +1D47D ; 0056 ; MA # ( 𝑽 → V ) MATHEMATICAL BOLD ITALIC CAPITAL V → LATIN CAPITAL LETTER V # +1D4B1 ; 0056 ; MA # ( 𝒱 → V ) MATHEMATICAL SCRIPT CAPITAL V → LATIN CAPITAL LETTER V # +1D4E5 ; 0056 ; MA # ( 𝓥 → V ) MATHEMATICAL BOLD SCRIPT CAPITAL V → LATIN CAPITAL LETTER V # +1D519 ; 0056 ; MA # ( 𝔙 → V ) MATHEMATICAL FRAKTUR CAPITAL V → LATIN CAPITAL LETTER V # +1D54D ; 0056 ; MA # ( 𝕍 → V ) MATHEMATICAL DOUBLE-STRUCK CAPITAL V → LATIN CAPITAL LETTER V # +1D581 ; 0056 ; MA # ( 𝖁 → V ) MATHEMATICAL BOLD FRAKTUR CAPITAL V → LATIN CAPITAL LETTER V # +1D5B5 ; 0056 ; MA # ( 𝖵 → V ) MATHEMATICAL SANS-SERIF CAPITAL V → LATIN CAPITAL LETTER V # +1D5E9 ; 0056 ; MA # ( 𝗩 → V ) MATHEMATICAL SANS-SERIF BOLD CAPITAL V → LATIN CAPITAL LETTER V # +1D61D ; 0056 ; MA # ( 𝘝 → V ) 
MATHEMATICAL SANS-SERIF ITALIC CAPITAL V → LATIN CAPITAL LETTER V # +1D651 ; 0056 ; MA # ( 𝙑 → V ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL V → LATIN CAPITAL LETTER V # +1D685 ; 0056 ; MA # ( 𝚅 → V ) MATHEMATICAL MONOSPACE CAPITAL V → LATIN CAPITAL LETTER V # +0474 ; 0056 ; MA # ( Ѵ → V ) CYRILLIC CAPITAL LETTER IZHITSA → LATIN CAPITAL LETTER V # +2D38 ; 0056 ; MA # ( ⴸ → V ) TIFINAGH LETTER YADH → LATIN CAPITAL LETTER V # +13D9 ; 0056 ; MA # ( Ꮩ → V ) CHEROKEE LETTER DO → LATIN CAPITAL LETTER V # +142F ; 0056 ; MA # ( ᐯ → V ) CANADIAN SYLLABICS PE → LATIN CAPITAL LETTER V # +A6DF ; 0056 ; MA # ( ꛟ → V ) BAMUM LETTER KO → LATIN CAPITAL LETTER V # +A4E6 ; 0056 ; MA # ( ꓦ → V ) LISU LETTER HA → LATIN CAPITAL LETTER V # +16F08 ; 0056 ; MA # ( 𖼈 → V ) MIAO LETTER VA → LATIN CAPITAL LETTER V # +118A0 ; 0056 ; MA # ( 𑢠 → V ) WARANG CITI CAPITAL LETTER NGAA → LATIN CAPITAL LETTER V # +1051D ; 0056 ; MA # ( 𐔝 → V ) ELBASAN LETTER TE → LATIN CAPITAL LETTER V # + +10197 ; 0056 0335 ; MA #* ( 𐆗 → V̵ ) ROMAN QUINARIUS SIGN → LATIN CAPITAL LETTER V, COMBINING SHORT STROKE OVERLAY # →V̶→ + +143B ; 0056 00B7 ; MA # ( ᐻ → V· ) CANADIAN SYLLABICS WEST-CREE PWE → LATIN CAPITAL LETTER V, MIDDLE DOT # →ᐯᐧ→ + +1F76C ; 0056 0042 ; MA #* ( 🝬 → VB ) ALCHEMICAL SYMBOL FOR BATH OF VAPOURS → LATIN CAPITAL LETTER V, LATIN CAPITAL LETTER B # + +2175 ; 0076 0069 ; MA # ( ⅵ → vi ) SMALL ROMAN NUMERAL SIX → LATIN SMALL LETTER V, LATIN SMALL LETTER I # + +2176 ; 0076 0069 0069 ; MA # ( ⅶ → vii ) SMALL ROMAN NUMERAL SEVEN → LATIN SMALL LETTER V, LATIN SMALL LETTER I, LATIN SMALL LETTER I # + +2177 ; 0076 0069 0069 0069 ; MA # ( ⅷ → viii ) SMALL ROMAN NUMERAL EIGHT → LATIN SMALL LETTER V, LATIN SMALL LETTER I, LATIN SMALL LETTER I, LATIN SMALL LETTER I # + +2165 ; 0056 006C ; MA # ( Ⅵ → Vl ) ROMAN NUMERAL SIX → LATIN CAPITAL LETTER V, LATIN SMALL LETTER L # →VI→ + +2166 ; 0056 006C 006C ; MA # ( Ⅶ → Vll ) ROMAN NUMERAL SEVEN → LATIN CAPITAL LETTER V, LATIN SMALL LETTER L, LATIN SMALL LETTER L # 
→VII→ + +2167 ; 0056 006C 006C 006C ; MA # ( Ⅷ → Vlll ) ROMAN NUMERAL EIGHT → LATIN CAPITAL LETTER V, LATIN SMALL LETTER L, LATIN SMALL LETTER L, LATIN SMALL LETTER L # →VIII→ + +1F708 ; 0056 1DE4 ; MA #* ( 🜈 → Vᷤ ) ALCHEMICAL SYMBOL FOR AQUA VITAE → LATIN CAPITAL LETTER V, COMBINING LATIN SMALL LETTER S # + +1D27 ; 028C ; MA # ( ᴧ → ʌ ) GREEK LETTER SMALL CAPITAL LAMDA → LATIN SMALL LETTER TURNED V # +2C97 ; 028C ; MA # ( ⲗ → ʌ ) COPTIC SMALL LETTER LAULA → LATIN SMALL LETTER TURNED V # +104D8 ; 028C ; MA # ( 𐓘 → ʌ ) OSAGE SMALL LETTER A → LATIN SMALL LETTER TURNED V # + +0668 ; 0245 ; MA # ( ‎٨‎ → Ʌ ) ARABIC-INDIC DIGIT EIGHT → LATIN CAPITAL LETTER TURNED V # →Λ→ +06F8 ; 0245 ; MA # ( ۸ → Ʌ ) EXTENDED ARABIC-INDIC DIGIT EIGHT → LATIN CAPITAL LETTER TURNED V # →‎٨‎→→Λ→ +A7DA ; 0245 ; MA # ( Ꟛ → Ʌ ) LATIN CAPITAL LETTER LAMBDA → LATIN CAPITAL LETTER TURNED V # →Λ→ +039B ; 0245 ; MA # ( Λ → Ʌ ) GREEK CAPITAL LETTER LAMDA → LATIN CAPITAL LETTER TURNED V # +1D6B2 ; 0245 ; MA # ( 𝚲 → Ʌ ) MATHEMATICAL BOLD CAPITAL LAMDA → LATIN CAPITAL LETTER TURNED V # →Λ→ +1D6EC ; 0245 ; MA # ( 𝛬 → Ʌ ) MATHEMATICAL ITALIC CAPITAL LAMDA → LATIN CAPITAL LETTER TURNED V # →Λ→ +1D726 ; 0245 ; MA # ( 𝜦 → Ʌ ) MATHEMATICAL BOLD ITALIC CAPITAL LAMDA → LATIN CAPITAL LETTER TURNED V # →Λ→ +1D760 ; 0245 ; MA # ( 𝝠 → Ʌ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL LAMDA → LATIN CAPITAL LETTER TURNED V # →Λ→ +1D79A ; 0245 ; MA # ( 𝞚 → Ʌ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL LAMDA → LATIN CAPITAL LETTER TURNED V # →Λ→ +041B ; 0245 ; MA # ( Л → Ʌ ) CYRILLIC CAPITAL LETTER EL → LATIN CAPITAL LETTER TURNED V # →Λ→ +2D37 ; 0245 ; MA # ( ⴷ → Ʌ ) TIFINAGH LETTER YAD → LATIN CAPITAL LETTER TURNED V # +104B0 ; 0245 ; MA # ( 𐒰 → Ʌ ) OSAGE CAPITAL LETTER A → LATIN CAPITAL LETTER TURNED V # +1431 ; 0245 ; MA # ( ᐱ → Ʌ ) CANADIAN SYLLABICS PI → LATIN CAPITAL LETTER TURNED V # +A6CE ; 0245 ; MA # ( ꛎ → Ʌ ) BAMUM LETTER MI → LATIN CAPITAL LETTER TURNED V # →Λ→ +A4E5 ; 0245 ; MA # ( ꓥ → Ʌ ) LISU LETTER NGA → 
LATIN CAPITAL LETTER TURNED V # +16F3D ; 0245 ; MA # ( 𖼽 → Ʌ ) MIAO LETTER ZZA → LATIN CAPITAL LETTER TURNED V # +1028D ; 0245 ; MA # ( 𐊍 → Ʌ ) LYCIAN LETTER L → LATIN CAPITAL LETTER TURNED V # →Λ→ + +A7DC ; 0245 0338 ; MA # ( Ƛ → Ʌ̸ ) LATIN CAPITAL LETTER LAMBDA WITH STROKE → LATIN CAPITAL LETTER TURNED V, COMBINING LONG SOLIDUS OVERLAY # →Λ̷→ + +04C5 ; 0245 0326 ; MA # ( Ӆ → Ʌ̦ ) CYRILLIC CAPITAL LETTER EL WITH TAIL → LATIN CAPITAL LETTER TURNED V, COMBINING COMMA BELOW # →Л̡→ + +143D ; 0245 00B7 ; MA # ( ᐽ → Ʌ· ) CANADIAN SYLLABICS WEST-CREE PWI → LATIN CAPITAL LETTER TURNED V, MIDDLE DOT # →ᐱᐧ→→ᐱ·→ + +026F ; 0077 ; MA # ( ɯ → w ) LATIN SMALL LETTER TURNED M → LATIN SMALL LETTER W # +1D430 ; 0077 ; MA # ( 𝐰 → w ) MATHEMATICAL BOLD SMALL W → LATIN SMALL LETTER W # +1D464 ; 0077 ; MA # ( 𝑤 → w ) MATHEMATICAL ITALIC SMALL W → LATIN SMALL LETTER W # +1D498 ; 0077 ; MA # ( 𝒘 → w ) MATHEMATICAL BOLD ITALIC SMALL W → LATIN SMALL LETTER W # +1D4CC ; 0077 ; MA # ( 𝓌 → w ) MATHEMATICAL SCRIPT SMALL W → LATIN SMALL LETTER W # +1D500 ; 0077 ; MA # ( 𝔀 → w ) MATHEMATICAL BOLD SCRIPT SMALL W → LATIN SMALL LETTER W # +1D534 ; 0077 ; MA # ( 𝔴 → w ) MATHEMATICAL FRAKTUR SMALL W → LATIN SMALL LETTER W # +1D568 ; 0077 ; MA # ( 𝕨 → w ) MATHEMATICAL DOUBLE-STRUCK SMALL W → LATIN SMALL LETTER W # +1D59C ; 0077 ; MA # ( 𝖜 → w ) MATHEMATICAL BOLD FRAKTUR SMALL W → LATIN SMALL LETTER W # +1D5D0 ; 0077 ; MA # ( 𝗐 → w ) MATHEMATICAL SANS-SERIF SMALL W → LATIN SMALL LETTER W # +1D604 ; 0077 ; MA # ( 𝘄 → w ) MATHEMATICAL SANS-SERIF BOLD SMALL W → LATIN SMALL LETTER W # +1D638 ; 0077 ; MA # ( 𝘸 → w ) MATHEMATICAL SANS-SERIF ITALIC SMALL W → LATIN SMALL LETTER W # +1D66C ; 0077 ; MA # ( 𝙬 → w ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL W → LATIN SMALL LETTER W # +1D6A0 ; 0077 ; MA # ( 𝚠 → w ) MATHEMATICAL MONOSPACE SMALL W → LATIN SMALL LETTER W # +1D21 ; 0077 ; MA # ( ᴡ → w ) LATIN LETTER SMALL CAPITAL W → LATIN SMALL LETTER W # +2CBD ; 0077 ; MA # ( ⲽ → w ) COPTIC SMALL LETTER CRYPTOGRAMMIC 
NI → LATIN SMALL LETTER W # →ш→ +0461 ; 0077 ; MA # ( ѡ → w ) CYRILLIC SMALL LETTER OMEGA → LATIN SMALL LETTER W # +0448 ; 0077 ; MA # ( ш → w ) CYRILLIC SMALL LETTER SHA → LATIN SMALL LETTER W # +051D ; 0077 ; MA # ( ԝ → w ) CYRILLIC SMALL LETTER WE → LATIN SMALL LETTER W # +0561 ; 0077 ; MA # ( ա → w ) ARMENIAN SMALL LETTER AYB → LATIN SMALL LETTER W # →ɯ→ +1170A ; 0077 ; MA # ( 𑜊 → w ) AHOM LETTER JA → LATIN SMALL LETTER W # +1170E ; 0077 ; MA # ( 𑜎 → w ) AHOM LETTER LA → LATIN SMALL LETTER W # +1170F ; 0077 ; MA # ( 𑜏 → w ) AHOM LETTER SA → LATIN SMALL LETTER W # +AB83 ; 0077 ; MA # ( ꮃ → w ) CHEROKEE SMALL LETTER LA → LATIN SMALL LETTER W # →ᴡ→ + +118E6 ; 0057 ; MA # ( 𑣦 → W ) WARANG CITI DIGIT SIX → LATIN CAPITAL LETTER W # +118EF ; 0057 ; MA #* ( 𑣯 → W ) WARANG CITI NUMBER SIXTY → LATIN CAPITAL LETTER W # +1CCEC ; 0057 ; MA #* ( 𜳬 → W ) OUTLINED LATIN CAPITAL LETTER W → LATIN CAPITAL LETTER W # +1D416 ; 0057 ; MA # ( 𝐖 → W ) MATHEMATICAL BOLD CAPITAL W → LATIN CAPITAL LETTER W # +1D44A ; 0057 ; MA # ( 𝑊 → W ) MATHEMATICAL ITALIC CAPITAL W → LATIN CAPITAL LETTER W # +1D47E ; 0057 ; MA # ( 𝑾 → W ) MATHEMATICAL BOLD ITALIC CAPITAL W → LATIN CAPITAL LETTER W # +1D4B2 ; 0057 ; MA # ( 𝒲 → W ) MATHEMATICAL SCRIPT CAPITAL W → LATIN CAPITAL LETTER W # +1D4E6 ; 0057 ; MA # ( 𝓦 → W ) MATHEMATICAL BOLD SCRIPT CAPITAL W → LATIN CAPITAL LETTER W # +1D51A ; 0057 ; MA # ( 𝔚 → W ) MATHEMATICAL FRAKTUR CAPITAL W → LATIN CAPITAL LETTER W # +1D54E ; 0057 ; MA # ( 𝕎 → W ) MATHEMATICAL DOUBLE-STRUCK CAPITAL W → LATIN CAPITAL LETTER W # +1D582 ; 0057 ; MA # ( 𝖂 → W ) MATHEMATICAL BOLD FRAKTUR CAPITAL W → LATIN CAPITAL LETTER W # +1D5B6 ; 0057 ; MA # ( 𝖶 → W ) MATHEMATICAL SANS-SERIF CAPITAL W → LATIN CAPITAL LETTER W # +1D5EA ; 0057 ; MA # ( 𝗪 → W ) MATHEMATICAL SANS-SERIF BOLD CAPITAL W → LATIN CAPITAL LETTER W # +1D61E ; 0057 ; MA # ( 𝘞 → W ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL W → LATIN CAPITAL LETTER W # +1D652 ; 0057 ; MA # ( 𝙒 → W ) MATHEMATICAL SANS-SERIF BOLD ITALIC 
CAPITAL W → LATIN CAPITAL LETTER W # +1D686 ; 0057 ; MA # ( 𝚆 → W ) MATHEMATICAL MONOSPACE CAPITAL W → LATIN CAPITAL LETTER W # +051C ; 0057 ; MA # ( Ԝ → W ) CYRILLIC CAPITAL LETTER WE → LATIN CAPITAL LETTER W # +13B3 ; 0057 ; MA # ( Ꮃ → W ) CHEROKEE LETTER LA → LATIN CAPITAL LETTER W # +13D4 ; 0057 ; MA # ( Ꮤ → W ) CHEROKEE LETTER TA → LATIN CAPITAL LETTER W # +A4EA ; 0057 ; MA # ( ꓪ → W ) LISU LETTER WA → LATIN CAPITAL LETTER W # + +047D ; 0077 0486 0487 ; MA # ( ѽ → w҆҇ ) CYRILLIC SMALL LETTER OMEGA WITH TITLO → LATIN SMALL LETTER W, COMBINING CYRILLIC PSILI PNEUMATA, COMBINING CYRILLIC POKRYTIE # →ѡ҆҇→ + +114C5 ; 0077 0307 ; MA # ( 𑓅 → ẇ ) TIRHUTA GVANG → LATIN SMALL LETTER W, COMBINING DOT ABOVE # + +20A9 ; 0057 0335 ; MA #* ( ₩ → W̵ ) WON SIGN → LATIN CAPITAL LETTER W, COMBINING SHORT STROKE OVERLAY # + +A761 ; 0077 0326 ; MA # ( ꝡ → w̦ ) LATIN SMALL LETTER VY → LATIN SMALL LETTER W, COMBINING COMMA BELOW # →w̡→ + +1D0D ; 028D ; MA # ( ᴍ → ʍ ) LATIN LETTER SMALL CAPITAL M → LATIN SMALL LETTER TURNED W # →м→ +2C99 ; 028D ; MA # ( ⲙ → ʍ ) COPTIC SMALL LETTER MI → LATIN SMALL LETTER TURNED W # →ᴍ→→м→ +043C ; 028D ; MA # ( м → ʍ ) CYRILLIC SMALL LETTER EM → LATIN SMALL LETTER TURNED W # +AB87 ; 028D ; MA # ( ꮇ → ʍ ) CHEROKEE SMALL LETTER LU → LATIN SMALL LETTER TURNED W # →ᴍ→→м→ + +04CE ; 028D 0326 ; MA # ( ӎ → ʍ̦ ) CYRILLIC SMALL LETTER EM WITH TAIL → LATIN SMALL LETTER TURNED W, COMBINING COMMA BELOW # →м̡→ + +166E ; 0078 ; MA #* ( ᙮ → x ) CANADIAN SYLLABICS FULL STOP → LATIN SMALL LETTER X # +00D7 ; 0078 ; MA #* ( × → x ) MULTIPLICATION SIGN → LATIN SMALL LETTER X # +292B ; 0078 ; MA #* ( ⤫ → x ) RISING DIAGONAL CROSSING FALLING DIAGONAL → LATIN SMALL LETTER X # +292C ; 0078 ; MA #* ( ⤬ → x ) FALLING DIAGONAL CROSSING RISING DIAGONAL → LATIN SMALL LETTER X # +2A2F ; 0078 ; MA #* ( ⨯ → x ) VECTOR OR CROSS PRODUCT → LATIN SMALL LETTER X # →×→ +FF58 ; 0078 ; MA # ( x → x ) FULLWIDTH LATIN SMALL LETTER X → LATIN SMALL LETTER X # →х→ +2179 ; 0078 ; MA # ( ⅹ → x ) 
SMALL ROMAN NUMERAL TEN → LATIN SMALL LETTER X # +1D431 ; 0078 ; MA # ( 𝐱 → x ) MATHEMATICAL BOLD SMALL X → LATIN SMALL LETTER X # +1D465 ; 0078 ; MA # ( 𝑥 → x ) MATHEMATICAL ITALIC SMALL X → LATIN SMALL LETTER X # +1D499 ; 0078 ; MA # ( 𝒙 → x ) MATHEMATICAL BOLD ITALIC SMALL X → LATIN SMALL LETTER X # +1D4CD ; 0078 ; MA # ( 𝓍 → x ) MATHEMATICAL SCRIPT SMALL X → LATIN SMALL LETTER X # +1D501 ; 0078 ; MA # ( 𝔁 → x ) MATHEMATICAL BOLD SCRIPT SMALL X → LATIN SMALL LETTER X # +1D535 ; 0078 ; MA # ( 𝔵 → x ) MATHEMATICAL FRAKTUR SMALL X → LATIN SMALL LETTER X # +1D569 ; 0078 ; MA # ( 𝕩 → x ) MATHEMATICAL DOUBLE-STRUCK SMALL X → LATIN SMALL LETTER X # +1D59D ; 0078 ; MA # ( 𝖝 → x ) MATHEMATICAL BOLD FRAKTUR SMALL X → LATIN SMALL LETTER X # +1D5D1 ; 0078 ; MA # ( 𝗑 → x ) MATHEMATICAL SANS-SERIF SMALL X → LATIN SMALL LETTER X # +1D605 ; 0078 ; MA # ( 𝘅 → x ) MATHEMATICAL SANS-SERIF BOLD SMALL X → LATIN SMALL LETTER X # +1D639 ; 0078 ; MA # ( 𝘹 → x ) MATHEMATICAL SANS-SERIF ITALIC SMALL X → LATIN SMALL LETTER X # +1D66D ; 0078 ; MA # ( 𝙭 → x ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL X → LATIN SMALL LETTER X # +1D6A1 ; 0078 ; MA # ( 𝚡 → x ) MATHEMATICAL MONOSPACE SMALL X → LATIN SMALL LETTER X # +0445 ; 0078 ; MA # ( х → x ) CYRILLIC SMALL LETTER HA → LATIN SMALL LETTER X # +1541 ; 0078 ; MA # ( ᕁ → x ) CANADIAN SYLLABICS SAYISI YI → LATIN SMALL LETTER X # →᙮→ +157D ; 0078 ; MA # ( ᕽ → x ) CANADIAN SYLLABICS HK → LATIN SMALL LETTER X # →ᕁ→→᙮→ + +2DEF ; 036F ; MA # ( ⷯ → ͯ ) COMBINING CYRILLIC LETTER HA → COMBINING LATIN SMALL LETTER X # + +166D ; 0058 ; MA #* ( ᙭ → X ) CANADIAN SYLLABICS CHI SIGN → LATIN CAPITAL LETTER X # +2573 ; 0058 ; MA #* ( ╳ → X ) BOX DRAWINGS LIGHT DIAGONAL CROSS → LATIN CAPITAL LETTER X # +10322 ; 0058 ; MA #* ( 𐌢 → X ) OLD ITALIC NUMERAL TEN → LATIN CAPITAL LETTER X # →𐌗→ +118EC ; 0058 ; MA #* ( 𑣬 → X ) WARANG CITI NUMBER THIRTY → LATIN CAPITAL LETTER X # +FF38 ; 0058 ; MA # ( X → X ) FULLWIDTH LATIN CAPITAL LETTER X → LATIN CAPITAL LETTER X # →Х→ 
+2169 ; 0058 ; MA # ( Ⅹ → X ) ROMAN NUMERAL TEN → LATIN CAPITAL LETTER X # +1CCED ; 0058 ; MA #* ( 𜳭 → X ) OUTLINED LATIN CAPITAL LETTER X → LATIN CAPITAL LETTER X # +1D417 ; 0058 ; MA # ( 𝐗 → X ) MATHEMATICAL BOLD CAPITAL X → LATIN CAPITAL LETTER X # +1D44B ; 0058 ; MA # ( 𝑋 → X ) MATHEMATICAL ITALIC CAPITAL X → LATIN CAPITAL LETTER X # +1D47F ; 0058 ; MA # ( 𝑿 → X ) MATHEMATICAL BOLD ITALIC CAPITAL X → LATIN CAPITAL LETTER X # +1D4B3 ; 0058 ; MA # ( 𝒳 → X ) MATHEMATICAL SCRIPT CAPITAL X → LATIN CAPITAL LETTER X # +1D4E7 ; 0058 ; MA # ( 𝓧 → X ) MATHEMATICAL BOLD SCRIPT CAPITAL X → LATIN CAPITAL LETTER X # +1D51B ; 0058 ; MA # ( 𝔛 → X ) MATHEMATICAL FRAKTUR CAPITAL X → LATIN CAPITAL LETTER X # +1D54F ; 0058 ; MA # ( 𝕏 → X ) MATHEMATICAL DOUBLE-STRUCK CAPITAL X → LATIN CAPITAL LETTER X # +1D583 ; 0058 ; MA # ( 𝖃 → X ) MATHEMATICAL BOLD FRAKTUR CAPITAL X → LATIN CAPITAL LETTER X # +1D5B7 ; 0058 ; MA # ( 𝖷 → X ) MATHEMATICAL SANS-SERIF CAPITAL X → LATIN CAPITAL LETTER X # +1D5EB ; 0058 ; MA # ( 𝗫 → X ) MATHEMATICAL SANS-SERIF BOLD CAPITAL X → LATIN CAPITAL LETTER X # +1D61F ; 0058 ; MA # ( 𝘟 → X ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL X → LATIN CAPITAL LETTER X # +1D653 ; 0058 ; MA # ( 𝙓 → X ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL X → LATIN CAPITAL LETTER X # +1D687 ; 0058 ; MA # ( 𝚇 → X ) MATHEMATICAL MONOSPACE CAPITAL X → LATIN CAPITAL LETTER X # +A7B3 ; 0058 ; MA # ( Ꭓ → X ) LATIN CAPITAL LETTER CHI → LATIN CAPITAL LETTER X # +03A7 ; 0058 ; MA # ( Χ → X ) GREEK CAPITAL LETTER CHI → LATIN CAPITAL LETTER X # +1D6BE ; 0058 ; MA # ( 𝚾 → X ) MATHEMATICAL BOLD CAPITAL CHI → LATIN CAPITAL LETTER X # →Χ→ +1D6F8 ; 0058 ; MA # ( 𝛸 → X ) MATHEMATICAL ITALIC CAPITAL CHI → LATIN CAPITAL LETTER X # →Χ→ +1D732 ; 0058 ; MA # ( 𝜲 → X ) MATHEMATICAL BOLD ITALIC CAPITAL CHI → LATIN CAPITAL LETTER X # →𝑿→ +1D76C ; 0058 ; MA # ( 𝝬 → X ) MATHEMATICAL SANS-SERIF BOLD CAPITAL CHI → LATIN CAPITAL LETTER X # →Χ→ +1D7A6 ; 0058 ; MA # ( 𝞦 → X ) MATHEMATICAL SANS-SERIF BOLD ITALIC 
CAPITAL CHI → LATIN CAPITAL LETTER X # →Χ→ +2CAC ; 0058 ; MA # ( Ⲭ → X ) COPTIC CAPITAL LETTER KHI → LATIN CAPITAL LETTER X # →Х→ +0425 ; 0058 ; MA # ( Х → X ) CYRILLIC CAPITAL LETTER HA → LATIN CAPITAL LETTER X # +2D5D ; 0058 ; MA # ( ⵝ → X ) TIFINAGH LETTER YATH → LATIN CAPITAL LETTER X # +16B7 ; 0058 ; MA # ( ᚷ → X ) RUNIC LETTER GEBO GYFU G → LATIN CAPITAL LETTER X # +A4EB ; 0058 ; MA # ( ꓫ → X ) LISU LETTER SHA → LATIN CAPITAL LETTER X # +10290 ; 0058 ; MA # ( 𐊐 → X ) LYCIAN LETTER MM → LATIN CAPITAL LETTER X # +102B4 ; 0058 ; MA # ( 𐊴 → X ) CARIAN LETTER X → LATIN CAPITAL LETTER X # +10317 ; 0058 ; MA # ( 𐌗 → X ) OLD ITALIC LETTER EKS → LATIN CAPITAL LETTER X # +10527 ; 0058 ; MA # ( 𐔧 → X ) ELBASAN LETTER KHE → LATIN CAPITAL LETTER X # + +2A30 ; 0078 0307 ; MA #* ( ⨰ → ẋ ) MULTIPLICATION SIGN WITH DOT ABOVE → LATIN SMALL LETTER X, COMBINING DOT ABOVE # →×̇→ + +04B2 ; 0058 0329 ; MA # ( Ҳ → X̩ ) CYRILLIC CAPITAL LETTER HA WITH DESCENDER → LATIN CAPITAL LETTER X, COMBINING VERTICAL LINE BELOW # →Х̩→ + +10196 ; 0058 0335 ; MA #* ( 𐆖 → X̵ ) ROMAN DENARIUS SIGN → LATIN CAPITAL LETTER X, COMBINING SHORT STROKE OVERLAY # →X̶→ + +217A ; 0078 0069 ; MA # ( ⅺ → xi ) SMALL ROMAN NUMERAL ELEVEN → LATIN SMALL LETTER X, LATIN SMALL LETTER I # + +217B ; 0078 0069 0069 ; MA # ( ⅻ → xii ) SMALL ROMAN NUMERAL TWELVE → LATIN SMALL LETTER X, LATIN SMALL LETTER I, LATIN SMALL LETTER I # + +216A ; 0058 006C ; MA # ( Ⅺ → Xl ) ROMAN NUMERAL ELEVEN → LATIN CAPITAL LETTER X, LATIN SMALL LETTER L # →XI→ + +216B ; 0058 006C 006C ; MA # ( Ⅻ → Xll ) ROMAN NUMERAL TWELVE → LATIN CAPITAL LETTER X, LATIN SMALL LETTER L, LATIN SMALL LETTER L # →XII→ + +0263 ; 0079 ; MA # ( ɣ → y ) LATIN SMALL LETTER GAMMA → LATIN SMALL LETTER Y # →γ→ +1D8C ; 0079 ; MA # ( ᶌ → y ) LATIN SMALL LETTER V WITH PALATAL HOOK → LATIN SMALL LETTER Y # +FF59 ; 0079 ; MA # ( y → y ) FULLWIDTH LATIN SMALL LETTER Y → LATIN SMALL LETTER Y # →у→ +1D432 ; 0079 ; MA # ( 𝐲 → y ) MATHEMATICAL BOLD SMALL Y → LATIN SMALL LETTER 
Y # +1D466 ; 0079 ; MA # ( 𝑦 → y ) MATHEMATICAL ITALIC SMALL Y → LATIN SMALL LETTER Y # +1D49A ; 0079 ; MA # ( 𝒚 → y ) MATHEMATICAL BOLD ITALIC SMALL Y → LATIN SMALL LETTER Y # +1D4CE ; 0079 ; MA # ( 𝓎 → y ) MATHEMATICAL SCRIPT SMALL Y → LATIN SMALL LETTER Y # +1D502 ; 0079 ; MA # ( 𝔂 → y ) MATHEMATICAL BOLD SCRIPT SMALL Y → LATIN SMALL LETTER Y # +1D536 ; 0079 ; MA # ( 𝔶 → y ) MATHEMATICAL FRAKTUR SMALL Y → LATIN SMALL LETTER Y # +1D56A ; 0079 ; MA # ( 𝕪 → y ) MATHEMATICAL DOUBLE-STRUCK SMALL Y → LATIN SMALL LETTER Y # +1D59E ; 0079 ; MA # ( 𝖞 → y ) MATHEMATICAL BOLD FRAKTUR SMALL Y → LATIN SMALL LETTER Y # +1D5D2 ; 0079 ; MA # ( 𝗒 → y ) MATHEMATICAL SANS-SERIF SMALL Y → LATIN SMALL LETTER Y # +1D606 ; 0079 ; MA # ( 𝘆 → y ) MATHEMATICAL SANS-SERIF BOLD SMALL Y → LATIN SMALL LETTER Y # +1D63A ; 0079 ; MA # ( 𝘺 → y ) MATHEMATICAL SANS-SERIF ITALIC SMALL Y → LATIN SMALL LETTER Y # +1D66E ; 0079 ; MA # ( 𝙮 → y ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL Y → LATIN SMALL LETTER Y # +1D6A2 ; 0079 ; MA # ( 𝚢 → y ) MATHEMATICAL MONOSPACE SMALL Y → LATIN SMALL LETTER Y # +028F ; 0079 ; MA # ( ʏ → y ) LATIN LETTER SMALL CAPITAL Y → LATIN SMALL LETTER Y # →ү→→γ→ +1EFF ; 0079 ; MA # ( ỿ → y ) LATIN SMALL LETTER Y WITH LOOP → LATIN SMALL LETTER Y # +AB5A ; 0079 ; MA # ( ꭚ → y ) LATIN SMALL LETTER Y WITH SHORT RIGHT LEG → LATIN SMALL LETTER Y # +03B3 ; 0079 ; MA # ( γ → y ) GREEK SMALL LETTER GAMMA → LATIN SMALL LETTER Y # +213D ; 0079 ; MA # ( ℽ → y ) DOUBLE-STRUCK SMALL GAMMA → LATIN SMALL LETTER Y # →γ→ +1D6C4 ; 0079 ; MA # ( 𝛄 → y ) MATHEMATICAL BOLD SMALL GAMMA → LATIN SMALL LETTER Y # →γ→ +1D6FE ; 0079 ; MA # ( 𝛾 → y ) MATHEMATICAL ITALIC SMALL GAMMA → LATIN SMALL LETTER Y # →γ→ +1D738 ; 0079 ; MA # ( 𝜸 → y ) MATHEMATICAL BOLD ITALIC SMALL GAMMA → LATIN SMALL LETTER Y # →γ→ +1D772 ; 0079 ; MA # ( 𝝲 → y ) MATHEMATICAL SANS-SERIF BOLD SMALL GAMMA → LATIN SMALL LETTER Y # →γ→ +1D7AC ; 0079 ; MA # ( 𝞬 → y ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL GAMMA → LATIN SMALL LETTER Y 
# →γ→ +2CA9 ; 0079 ; MA # ( ⲩ → y ) COPTIC SMALL LETTER UA → LATIN SMALL LETTER Y # →γ→ +0443 ; 0079 ; MA # ( у → y ) CYRILLIC SMALL LETTER U → LATIN SMALL LETTER Y # +04AF ; 0079 ; MA # ( ү → y ) CYRILLIC SMALL LETTER STRAIGHT U → LATIN SMALL LETTER Y # →γ→ +10E7 ; 0079 ; MA # ( ყ → y ) GEORGIAN LETTER QAR → LATIN SMALL LETTER Y # +118DC ; 0079 ; MA # ( 𑣜 → y ) WARANG CITI SMALL LETTER HAR → LATIN SMALL LETTER Y # →ɣ→→γ→ + +FF39 ; 0059 ; MA # ( Y → Y ) FULLWIDTH LATIN CAPITAL LETTER Y → LATIN CAPITAL LETTER Y # →Υ→ +1CCEE ; 0059 ; MA #* ( 𜳮 → Y ) OUTLINED LATIN CAPITAL LETTER Y → LATIN CAPITAL LETTER Y # +1D418 ; 0059 ; MA # ( 𝐘 → Y ) MATHEMATICAL BOLD CAPITAL Y → LATIN CAPITAL LETTER Y # +1D44C ; 0059 ; MA # ( 𝑌 → Y ) MATHEMATICAL ITALIC CAPITAL Y → LATIN CAPITAL LETTER Y # +1D480 ; 0059 ; MA # ( 𝒀 → Y ) MATHEMATICAL BOLD ITALIC CAPITAL Y → LATIN CAPITAL LETTER Y # +1D4B4 ; 0059 ; MA # ( 𝒴 → Y ) MATHEMATICAL SCRIPT CAPITAL Y → LATIN CAPITAL LETTER Y # +1D4E8 ; 0059 ; MA # ( 𝓨 → Y ) MATHEMATICAL BOLD SCRIPT CAPITAL Y → LATIN CAPITAL LETTER Y # +1D51C ; 0059 ; MA # ( 𝔜 → Y ) MATHEMATICAL FRAKTUR CAPITAL Y → LATIN CAPITAL LETTER Y # +1D550 ; 0059 ; MA # ( 𝕐 → Y ) MATHEMATICAL DOUBLE-STRUCK CAPITAL Y → LATIN CAPITAL LETTER Y # +1D584 ; 0059 ; MA # ( 𝖄 → Y ) MATHEMATICAL BOLD FRAKTUR CAPITAL Y → LATIN CAPITAL LETTER Y # +1D5B8 ; 0059 ; MA # ( 𝖸 → Y ) MATHEMATICAL SANS-SERIF CAPITAL Y → LATIN CAPITAL LETTER Y # +1D5EC ; 0059 ; MA # ( 𝗬 → Y ) MATHEMATICAL SANS-SERIF BOLD CAPITAL Y → LATIN CAPITAL LETTER Y # +1D620 ; 0059 ; MA # ( 𝘠 → Y ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL Y → LATIN CAPITAL LETTER Y # +1D654 ; 0059 ; MA # ( 𝙔 → Y ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL Y → LATIN CAPITAL LETTER Y # +1D688 ; 0059 ; MA # ( 𝚈 → Y ) MATHEMATICAL MONOSPACE CAPITAL Y → LATIN CAPITAL LETTER Y # +03A5 ; 0059 ; MA # ( Υ → Y ) GREEK CAPITAL LETTER UPSILON → LATIN CAPITAL LETTER Y # +03D2 ; 0059 ; MA # ( ϒ → Y ) GREEK UPSILON WITH HOOK SYMBOL → LATIN CAPITAL LETTER Y # 
+1D6BC ; 0059 ; MA # ( 𝚼 → Y ) MATHEMATICAL BOLD CAPITAL UPSILON → LATIN CAPITAL LETTER Y # →Υ→ +1D6F6 ; 0059 ; MA # ( 𝛶 → Y ) MATHEMATICAL ITALIC CAPITAL UPSILON → LATIN CAPITAL LETTER Y # →Υ→ +1D730 ; 0059 ; MA # ( 𝜰 → Y ) MATHEMATICAL BOLD ITALIC CAPITAL UPSILON → LATIN CAPITAL LETTER Y # →Υ→ +1D76A ; 0059 ; MA # ( 𝝪 → Y ) MATHEMATICAL SANS-SERIF BOLD CAPITAL UPSILON → LATIN CAPITAL LETTER Y # →Υ→ +1D7A4 ; 0059 ; MA # ( 𝞤 → Y ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL UPSILON → LATIN CAPITAL LETTER Y # →Υ→ +2CA8 ; 0059 ; MA # ( Ⲩ → Y ) COPTIC CAPITAL LETTER UA → LATIN CAPITAL LETTER Y # +0423 ; 0059 ; MA # ( У → Y ) CYRILLIC CAPITAL LETTER U → LATIN CAPITAL LETTER Y # +04AE ; 0059 ; MA # ( Ү → Y ) CYRILLIC CAPITAL LETTER STRAIGHT U → LATIN CAPITAL LETTER Y # +13A9 ; 0059 ; MA # ( Ꭹ → Y ) CHEROKEE LETTER GI → LATIN CAPITAL LETTER Y # +13BD ; 0059 ; MA # ( Ꮍ → Y ) CHEROKEE LETTER MU → LATIN CAPITAL LETTER Y # →Ꭹ→ +A4EC ; 0059 ; MA # ( ꓬ → Y ) LISU LETTER YA → LATIN CAPITAL LETTER Y # +16F43 ; 0059 ; MA # ( 𖽃 → Y ) MIAO LETTER AH → LATIN CAPITAL LETTER Y # +118A4 ; 0059 ; MA # ( 𑢤 → Y ) WARANG CITI CAPITAL LETTER YA → LATIN CAPITAL LETTER Y # +102B2 ; 0059 ; MA # ( 𐊲 → Y ) CARIAN LETTER U → LATIN CAPITAL LETTER Y # + +01B4 ; 0079 0314 ; MA # ( ƴ → y̔ ) LATIN SMALL LETTER Y WITH HOOK → LATIN SMALL LETTER Y, COMBINING REVERSED COMMA ABOVE # + +024F ; 0079 0335 ; MA # ( ɏ → y̵ ) LATIN SMALL LETTER Y WITH STROKE → LATIN SMALL LETTER Y, COMBINING SHORT STROKE OVERLAY # +04B1 ; 0079 0335 ; MA # ( ұ → y̵ ) CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE → LATIN SMALL LETTER Y, COMBINING SHORT STROKE OVERLAY # →ү̵→ + +00A5 ; 0059 0335 ; MA #* ( ¥ → Y̵ ) YEN SIGN → LATIN CAPITAL LETTER Y, COMBINING SHORT STROKE OVERLAY # +024E ; 0059 0335 ; MA # ( Ɏ → Y̵ ) LATIN CAPITAL LETTER Y WITH STROKE → LATIN CAPITAL LETTER Y, COMBINING SHORT STROKE OVERLAY # +04B0 ; 0059 0335 ; MA # ( Ұ → Y̵ ) CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE → LATIN CAPITAL LETTER Y, COMBINING SHORT 
STROKE OVERLAY # →Ү̵→ + +0292 ; 021D ; MA # ( ʒ → ȝ ) LATIN SMALL LETTER EZH → LATIN SMALL LETTER YOGH # +A76B ; 021D ; MA # ( ꝫ → ȝ ) LATIN SMALL LETTER ET → LATIN SMALL LETTER YOGH # +2CC5 ; 021D ; MA # ( ⳅ → ȝ ) COPTIC SMALL LETTER OLD COPTIC SHEI → LATIN SMALL LETTER YOGH # →ʒ→ +2CCD ; 021D ; MA # ( ⳍ → ȝ ) COPTIC SMALL LETTER OLD COPTIC HORI → LATIN SMALL LETTER YOGH # +04E1 ; 021D ; MA # ( ӡ → ȝ ) CYRILLIC SMALL LETTER ABKHASIAN DZE → LATIN SMALL LETTER YOGH # →ʒ→ +10F3 ; 021D ; MA # ( ჳ → ȝ ) GEORGIAN LETTER WE → LATIN SMALL LETTER YOGH # →ʒ→ + +1D433 ; 007A ; MA # ( 𝐳 → z ) MATHEMATICAL BOLD SMALL Z → LATIN SMALL LETTER Z # +1D467 ; 007A ; MA # ( 𝑧 → z ) MATHEMATICAL ITALIC SMALL Z → LATIN SMALL LETTER Z # +1D49B ; 007A ; MA # ( 𝒛 → z ) MATHEMATICAL BOLD ITALIC SMALL Z → LATIN SMALL LETTER Z # +1D4CF ; 007A ; MA # ( 𝓏 → z ) MATHEMATICAL SCRIPT SMALL Z → LATIN SMALL LETTER Z # +1D503 ; 007A ; MA # ( 𝔃 → z ) MATHEMATICAL BOLD SCRIPT SMALL Z → LATIN SMALL LETTER Z # +1D537 ; 007A ; MA # ( 𝔷 → z ) MATHEMATICAL FRAKTUR SMALL Z → LATIN SMALL LETTER Z # +1D56B ; 007A ; MA # ( 𝕫 → z ) MATHEMATICAL DOUBLE-STRUCK SMALL Z → LATIN SMALL LETTER Z # +1D59F ; 007A ; MA # ( 𝖟 → z ) MATHEMATICAL BOLD FRAKTUR SMALL Z → LATIN SMALL LETTER Z # +1D5D3 ; 007A ; MA # ( 𝗓 → z ) MATHEMATICAL SANS-SERIF SMALL Z → LATIN SMALL LETTER Z # +1D607 ; 007A ; MA # ( 𝘇 → z ) MATHEMATICAL SANS-SERIF BOLD SMALL Z → LATIN SMALL LETTER Z # +1D63B ; 007A ; MA # ( 𝘻 → z ) MATHEMATICAL SANS-SERIF ITALIC SMALL Z → LATIN SMALL LETTER Z # +1D66F ; 007A ; MA # ( 𝙯 → z ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL Z → LATIN SMALL LETTER Z # +1D6A3 ; 007A ; MA # ( 𝚣 → z ) MATHEMATICAL MONOSPACE SMALL Z → LATIN SMALL LETTER Z # +1D22 ; 007A ; MA # ( ᴢ → z ) LATIN LETTER SMALL CAPITAL Z → LATIN SMALL LETTER Z # +AB93 ; 007A ; MA # ( ꮓ → z ) CHEROKEE SMALL LETTER NO → LATIN SMALL LETTER Z # →ᴢ→ +118C4 ; 007A ; MA # ( 𑣄 → z ) WARANG CITI SMALL LETTER YA → LATIN SMALL LETTER Z # + +118E5 ; 005A ; MA # ( 𑣥 → Z ) 
WARANG CITI DIGIT FIVE → LATIN CAPITAL LETTER Z # +102F5 ; 005A ; MA #* ( 𐋵 → Z ) COPTIC EPACT NUMBER THREE HUNDRED → LATIN CAPITAL LETTER Z # +FF3A ; 005A ; MA # ( Z → Z ) FULLWIDTH LATIN CAPITAL LETTER Z → LATIN CAPITAL LETTER Z # →Ζ→ +2124 ; 005A ; MA # ( ℤ → Z ) DOUBLE-STRUCK CAPITAL Z → LATIN CAPITAL LETTER Z # +2128 ; 005A ; MA # ( ℨ → Z ) BLACK-LETTER CAPITAL Z → LATIN CAPITAL LETTER Z # +1CCEF ; 005A ; MA #* ( 𜳯 → Z ) OUTLINED LATIN CAPITAL LETTER Z → LATIN CAPITAL LETTER Z # +1D419 ; 005A ; MA # ( 𝐙 → Z ) MATHEMATICAL BOLD CAPITAL Z → LATIN CAPITAL LETTER Z # +1D44D ; 005A ; MA # ( 𝑍 → Z ) MATHEMATICAL ITALIC CAPITAL Z → LATIN CAPITAL LETTER Z # +1D481 ; 005A ; MA # ( 𝒁 → Z ) MATHEMATICAL BOLD ITALIC CAPITAL Z → LATIN CAPITAL LETTER Z # +1D4B5 ; 005A ; MA # ( 𝒵 → Z ) MATHEMATICAL SCRIPT CAPITAL Z → LATIN CAPITAL LETTER Z # +1D4E9 ; 005A ; MA # ( 𝓩 → Z ) MATHEMATICAL BOLD SCRIPT CAPITAL Z → LATIN CAPITAL LETTER Z # +1D585 ; 005A ; MA # ( 𝖅 → Z ) MATHEMATICAL BOLD FRAKTUR CAPITAL Z → LATIN CAPITAL LETTER Z # +1D5B9 ; 005A ; MA # ( 𝖹 → Z ) MATHEMATICAL SANS-SERIF CAPITAL Z → LATIN CAPITAL LETTER Z # +1D5ED ; 005A ; MA # ( 𝗭 → Z ) MATHEMATICAL SANS-SERIF BOLD CAPITAL Z → LATIN CAPITAL LETTER Z # +1D621 ; 005A ; MA # ( 𝘡 → Z ) MATHEMATICAL SANS-SERIF ITALIC CAPITAL Z → LATIN CAPITAL LETTER Z # +1D655 ; 005A ; MA # ( 𝙕 → Z ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL Z → LATIN CAPITAL LETTER Z # +1D689 ; 005A ; MA # ( 𝚉 → Z ) MATHEMATICAL MONOSPACE CAPITAL Z → LATIN CAPITAL LETTER Z # +0396 ; 005A ; MA # ( Ζ → Z ) GREEK CAPITAL LETTER ZETA → LATIN CAPITAL LETTER Z # +1D6AD ; 005A ; MA # ( 𝚭 → Z ) MATHEMATICAL BOLD CAPITAL ZETA → LATIN CAPITAL LETTER Z # →Ζ→ +1D6E7 ; 005A ; MA # ( 𝛧 → Z ) MATHEMATICAL ITALIC CAPITAL ZETA → LATIN CAPITAL LETTER Z # →𝑍→ +1D721 ; 005A ; MA # ( 𝜡 → Z ) MATHEMATICAL BOLD ITALIC CAPITAL ZETA → LATIN CAPITAL LETTER Z # →Ζ→ +1D75B ; 005A ; MA # ( 𝝛 → Z ) MATHEMATICAL SANS-SERIF BOLD CAPITAL ZETA → LATIN CAPITAL LETTER Z # →Ζ→ +1D795 ; 
005A ; MA # ( 𝞕 → Z ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL ZETA → LATIN CAPITAL LETTER Z # →Ζ→ +13C3 ; 005A ; MA # ( Ꮓ → Z ) CHEROKEE LETTER NO → LATIN CAPITAL LETTER Z # +A4DC ; 005A ; MA # ( ꓜ → Z ) LISU LETTER DZA → LATIN CAPITAL LETTER Z # +118A9 ; 005A ; MA # ( 𑢩 → Z ) WARANG CITI CAPITAL LETTER O → LATIN CAPITAL LETTER Z # + +0290 ; 007A 0328 ; MA # ( ʐ → z̨ ) LATIN SMALL LETTER Z WITH RETROFLEX HOOK → LATIN SMALL LETTER Z, COMBINING OGONEK # →z̢→ + +01B6 ; 007A 0335 ; MA # ( ƶ → z̵ ) LATIN SMALL LETTER Z WITH STROKE → LATIN SMALL LETTER Z, COMBINING SHORT STROKE OVERLAY # + +01B5 ; 005A 0335 ; MA # ( Ƶ → Z̵ ) LATIN CAPITAL LETTER Z WITH STROKE → LATIN CAPITAL LETTER Z, COMBINING SHORT STROKE OVERLAY # + +0225 ; 007A 0326 ; MA # ( ȥ → z̦ ) LATIN SMALL LETTER Z WITH HOOK → LATIN SMALL LETTER Z, COMBINING COMMA BELOW # →z̡→ + +0224 ; 005A 0326 ; MA # ( Ȥ → Z̦ ) LATIN CAPITAL LETTER Z WITH HOOK → LATIN CAPITAL LETTER Z, COMBINING COMMA BELOW # →Z̧→ + +1D76 ; 007A 0334 ; MA # ( ᵶ → z̴ ) LATIN SMALL LETTER Z WITH MIDDLE TILDE → LATIN SMALL LETTER Z, COMBINING TILDE OVERLAY # + +2C8D ; 2C6C ; MA # ( ⲍ → ⱬ ) COPTIC SMALL LETTER ZATA → LATIN SMALL LETTER Z WITH DESCENDER # + +2C8C ; 2C6B ; MA # ( Ⲍ → Ⱬ ) COPTIC CAPITAL LETTER ZATA → LATIN CAPITAL LETTER Z WITH DESCENDER # + +2C9D ; 0293 ; MA # ( ⲝ → ʓ ) COPTIC SMALL LETTER KSI → LATIN SMALL LETTER EZH WITH CURL # + +03F7 ; 00DE ; MA # ( Ϸ → Þ ) GREEK CAPITAL LETTER SHO → LATIN CAPITAL LETTER THORN # +104C4 ; 00DE ; MA # ( 𐓄 → Þ ) OSAGE CAPITAL LETTER PA → LATIN CAPITAL LETTER THORN # + +A7D2 ; A7D3 ; MA # ( ꟒ → ꟓ ) LATIN CAPITAL LETTER DOUBLE THORN → LATIN SMALL LETTER DOUBLE THORN # + +A7D4 ; A7D5 ; MA # ( ꟔ → ꟕ ) LATIN CAPITAL LETTER DOUBLE WYNN → LATIN SMALL LETTER DOUBLE WYNN # + +2079 ; A770 ; MA #* ( ⁹ → ꝰ ) SUPERSCRIPT NINE → MODIFIER LETTER US # + +1D24 ; 01A8 ; MA # ( ᴤ → ƨ ) LATIN LETTER VOICED LARYNGEAL SPIRANT → LATIN SMALL LETTER TONE TWO # +03E9 ; 01A8 ; MA # ( ϩ → ƨ ) COPTIC SMALL LETTER HORI → 
LATIN SMALL LETTER TONE TWO # +A645 ; 01A8 ; MA # ( ꙅ → ƨ ) CYRILLIC SMALL LETTER REVERSED DZE → LATIN SMALL LETTER TONE TWO # + +044C ; 0185 ; MA # ( ь → ƅ ) CYRILLIC SMALL LETTER SOFT SIGN → LATIN SMALL LETTER TONE SIX # +AB9F ; 0185 ; MA # ( ꮟ → ƅ ) CHEROKEE SMALL LETTER SI → LATIN SMALL LETTER TONE SIX # →ь→ +16ED1 ; 0185 ; MA # ( 𖻑 → ƅ ) BERIA ERFE SMALL LETTER UI → LATIN SMALL LETTER TONE SIX # →ь→ + +044B ; 0185 0069 ; MA # ( ы → ƅi ) CYRILLIC SMALL LETTER YERU → LATIN SMALL LETTER TONE SIX, LATIN SMALL LETTER I # →ьı→ + +AB7E ; 0242 ; MA # ( ꭾ → ɂ ) CHEROKEE SMALL LETTER HE → LATIN SMALL LETTER GLOTTAL STOP # + +02E4 ; 02C1 ; MA # ( ˤ → ˁ ) MODIFIER LETTER SMALL REVERSED GLOTTAL STOP → MODIFIER LETTER REVERSED GLOTTAL STOP # + +A6CD ; 02A1 ; MA # ( ꛍ → ʡ ) BAMUM LETTER LU → LATIN LETTER GLOTTAL STOP WITH STROKE # + +256A ; 01C2 ; MA #* ( ╪ → ǂ ) BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE → LATIN LETTER ALVEOLAR CLICK # + +2299 ; 0298 ; MA #* ( ⊙ → ʘ ) CIRCLED DOT OPERATOR → LATIN LETTER BILABIAL CLICK # +2609 ; 0298 ; MA #* ( ☉ → ʘ ) SUN → LATIN LETTER BILABIAL CLICK # →⊙→ +2A00 ; 0298 ; MA #* ( ⨀ → ʘ ) N-ARY CIRCLED DOT OPERATOR → LATIN LETTER BILABIAL CLICK # →⊙→ +A668 ; 0298 ; MA # ( Ꙩ → ʘ ) CYRILLIC CAPITAL LETTER MONOCULAR O → LATIN LETTER BILABIAL CLICK # +2D59 ; 0298 ; MA # ( ⵙ → ʘ ) TIFINAGH LETTER YAS → LATIN LETTER BILABIAL CLICK # →⊙→ +104C3 ; 0298 ; MA # ( 𐓃 → ʘ ) OSAGE CAPITAL LETTER OIN → LATIN LETTER BILABIAL CLICK # →Ꙩ→ + +213E ; 0393 ; MA # ( ℾ → Γ ) DOUBLE-STRUCK CAPITAL GAMMA → GREEK CAPITAL LETTER GAMMA # +1D6AA ; 0393 ; MA # ( 𝚪 → Γ ) MATHEMATICAL BOLD CAPITAL GAMMA → GREEK CAPITAL LETTER GAMMA # +1D6E4 ; 0393 ; MA # ( 𝛤 → Γ ) MATHEMATICAL ITALIC CAPITAL GAMMA → GREEK CAPITAL LETTER GAMMA # +1D71E ; 0393 ; MA # ( 𝜞 → Γ ) MATHEMATICAL BOLD ITALIC CAPITAL GAMMA → GREEK CAPITAL LETTER GAMMA # +1D758 ; 0393 ; MA # ( 𝝘 → Γ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL GAMMA → GREEK CAPITAL LETTER GAMMA # +1D792 ; 0393 ; MA # ( 𝞒 → Γ ) 
MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL GAMMA → GREEK CAPITAL LETTER GAMMA # +2C84 ; 0393 ; MA # ( Ⲅ → Γ ) COPTIC CAPITAL LETTER GAMMA → GREEK CAPITAL LETTER GAMMA # +0413 ; 0393 ; MA # ( Г → Γ ) CYRILLIC CAPITAL LETTER GHE → GREEK CAPITAL LETTER GAMMA # +13B1 ; 0393 ; MA # ( Ꮁ → Γ ) CHEROKEE LETTER HU → GREEK CAPITAL LETTER GAMMA # +14A5 ; 0393 ; MA # ( ᒥ → Γ ) CANADIAN SYLLABICS MI → GREEK CAPITAL LETTER GAMMA # +16F07 ; 0393 ; MA # ( 𖼇 → Γ ) MIAO LETTER FA → GREEK CAPITAL LETTER GAMMA # + +0492 ; 0393 0335 ; MA # ( Ғ → Γ̵ ) CYRILLIC CAPITAL LETTER GHE WITH STROKE → GREEK CAPITAL LETTER GAMMA, COMBINING SHORT STROKE OVERLAY # →Г̵→ + +14AF ; 0393 00B7 ; MA # ( ᒯ → Γ· ) CANADIAN SYLLABICS WEST-CREE MWI → GREEK CAPITAL LETTER GAMMA, MIDDLE DOT # →ᒥᐧ→→ᒥ·→ + +0490 ; 0393 0027 ; MA # ( Ґ → Γ' ) CYRILLIC CAPITAL LETTER GHE WITH UPTURN → GREEK CAPITAL LETTER GAMMA, APOSTROPHE # →Гˈ→ + +2206 ; 0394 ; MA #* ( ∆ → Δ ) INCREMENT → GREEK CAPITAL LETTER DELTA # +25B3 ; 0394 ; MA #* ( △ → Δ ) WHITE UP-POINTING TRIANGLE → GREEK CAPITAL LETTER DELTA # +1F702 ; 0394 ; MA #* ( 🜂 → Δ ) ALCHEMICAL SYMBOL FOR FIRE → GREEK CAPITAL LETTER DELTA # →△→ +1D6AB ; 0394 ; MA # ( 𝚫 → Δ ) MATHEMATICAL BOLD CAPITAL DELTA → GREEK CAPITAL LETTER DELTA # +1D6E5 ; 0394 ; MA # ( 𝛥 → Δ ) MATHEMATICAL ITALIC CAPITAL DELTA → GREEK CAPITAL LETTER DELTA # +1D71F ; 0394 ; MA # ( 𝜟 → Δ ) MATHEMATICAL BOLD ITALIC CAPITAL DELTA → GREEK CAPITAL LETTER DELTA # +1D759 ; 0394 ; MA # ( 𝝙 → Δ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL DELTA → GREEK CAPITAL LETTER DELTA # +1D793 ; 0394 ; MA # ( 𝞓 → Δ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL DELTA → GREEK CAPITAL LETTER DELTA # +2C86 ; 0394 ; MA # ( Ⲇ → Δ ) COPTIC CAPITAL LETTER DALDA → GREEK CAPITAL LETTER DELTA # +2D60 ; 0394 ; MA # ( ⵠ → Δ ) TIFINAGH LETTER YAV → GREEK CAPITAL LETTER DELTA # +1403 ; 0394 ; MA # ( ᐃ → Δ ) CANADIAN SYLLABICS I → GREEK CAPITAL LETTER DELTA # +16F1A ; 0394 ; MA # ( 𖼚 → Δ ) MIAO LETTER TLHA → GREEK CAPITAL LETTER DELTA # +10285 ; 
0394 ; MA # ( 𐊅 → Δ ) LYCIAN LETTER D → GREEK CAPITAL LETTER DELTA # +102A3 ; 0394 ; MA # ( 𐊣 → Δ ) CARIAN LETTER L → GREEK CAPITAL LETTER DELTA # + +2359 ; 0394 0332 ; MA #* ( ⍙ → Δ̲ ) APL FUNCTIONAL SYMBOL DELTA UNDERBAR → GREEK CAPITAL LETTER DELTA, COMBINING LOW LINE # + +140F ; 0394 00B7 ; MA # ( ᐏ → Δ· ) CANADIAN SYLLABICS WEST-CREE WI → GREEK CAPITAL LETTER DELTA, MIDDLE DOT # →ᐃᐧ→ + +142C ; 0394 1420 ; MA # ( ᐬ → Δᐠ ) CANADIAN SYLLABICS IN → GREEK CAPITAL LETTER DELTA, CANADIAN SYLLABICS FINAL GRAVE # →ᐃᐠ→ + +1D7CB ; 03DD ; MA # ( 𝟋 → ϝ ) MATHEMATICAL BOLD SMALL DIGAMMA → GREEK SMALL LETTER DIGAMMA # + +1D6C7 ; 03B6 ; MA # ( 𝛇 → ζ ) MATHEMATICAL BOLD SMALL ZETA → GREEK SMALL LETTER ZETA # +1D701 ; 03B6 ; MA # ( 𝜁 → ζ ) MATHEMATICAL ITALIC SMALL ZETA → GREEK SMALL LETTER ZETA # +1D73B ; 03B6 ; MA # ( 𝜻 → ζ ) MATHEMATICAL BOLD ITALIC SMALL ZETA → GREEK SMALL LETTER ZETA # +1D775 ; 03B6 ; MA # ( 𝝵 → ζ ) MATHEMATICAL SANS-SERIF BOLD SMALL ZETA → GREEK SMALL LETTER ZETA # +1D7AF ; 03B6 ; MA # ( 𝞯 → ζ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ZETA → GREEK SMALL LETTER ZETA # + +2CE4 ; 03D7 ; MA # ( ⳤ → ϗ ) COPTIC SYMBOL KAI → GREEK KAI SYMBOL # + +A7DB ; 03BB ; MA # ( ꟛ → λ ) LATIN SMALL LETTER LAMBDA → GREEK SMALL LETTER LAMDA # +1D6CC ; 03BB ; MA # ( 𝛌 → λ ) MATHEMATICAL BOLD SMALL LAMDA → GREEK SMALL LETTER LAMDA # +1D706 ; 03BB ; MA # ( 𝜆 → λ ) MATHEMATICAL ITALIC SMALL LAMDA → GREEK SMALL LETTER LAMDA # +1D740 ; 03BB ; MA # ( 𝝀 → λ ) MATHEMATICAL BOLD ITALIC SMALL LAMDA → GREEK SMALL LETTER LAMDA # +1D77A ; 03BB ; MA # ( 𝝺 → λ ) MATHEMATICAL SANS-SERIF BOLD SMALL LAMDA → GREEK SMALL LETTER LAMDA # +1D7B4 ; 03BB ; MA # ( 𝞴 → λ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL LAMDA → GREEK SMALL LETTER LAMDA # +2C96 ; 03BB ; MA # ( Ⲗ → λ ) COPTIC CAPITAL LETTER LAULA → GREEK SMALL LETTER LAMDA # +104DB ; 03BB ; MA # ( 𐓛 → λ ) OSAGE SMALL LETTER AH → GREEK SMALL LETTER LAMDA # + +019B ; 03BB 0338 ; MA # ( ƛ → λ̸ ) LATIN SMALL LETTER LAMBDA WITH STROKE → GREEK 
SMALL LETTER LAMDA, COMBINING LONG SOLIDUS OVERLAY # →λ̷→ + +00B5 ; 03BC ; MA # ( µ → μ ) MICRO SIGN → GREEK SMALL LETTER MU # +1D6CD ; 03BC ; MA # ( 𝛍 → μ ) MATHEMATICAL BOLD SMALL MU → GREEK SMALL LETTER MU # +1D707 ; 03BC ; MA # ( 𝜇 → μ ) MATHEMATICAL ITALIC SMALL MU → GREEK SMALL LETTER MU # +1D741 ; 03BC ; MA # ( 𝝁 → μ ) MATHEMATICAL BOLD ITALIC SMALL MU → GREEK SMALL LETTER MU # +1D77B ; 03BC ; MA # ( 𝝻 → μ ) MATHEMATICAL SANS-SERIF BOLD SMALL MU → GREEK SMALL LETTER MU # +1D7B5 ; 03BC ; MA # ( 𝞵 → μ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL MU → GREEK SMALL LETTER MU # + +1D6CF ; 03BE ; MA # ( 𝛏 → ξ ) MATHEMATICAL BOLD SMALL XI → GREEK SMALL LETTER XI # +1D709 ; 03BE ; MA # ( 𝜉 → ξ ) MATHEMATICAL ITALIC SMALL XI → GREEK SMALL LETTER XI # +1D743 ; 03BE ; MA # ( 𝝃 → ξ ) MATHEMATICAL BOLD ITALIC SMALL XI → GREEK SMALL LETTER XI # +1D77D ; 03BE ; MA # ( 𝝽 → ξ ) MATHEMATICAL SANS-SERIF BOLD SMALL XI → GREEK SMALL LETTER XI # +1D7B7 ; 03BE ; MA # ( 𝞷 → ξ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL XI → GREEK SMALL LETTER XI # + +2630 ; 039E ; MA #* ( ☰ → Ξ ) TRIGRAM FOR HEAVEN → GREEK CAPITAL LETTER XI # →Ⲷ→ +1D6B5 ; 039E ; MA # ( 𝚵 → Ξ ) MATHEMATICAL BOLD CAPITAL XI → GREEK CAPITAL LETTER XI # +1D6EF ; 039E ; MA # ( 𝛯 → Ξ ) MATHEMATICAL ITALIC CAPITAL XI → GREEK CAPITAL LETTER XI # +1D729 ; 039E ; MA # ( 𝜩 → Ξ ) MATHEMATICAL BOLD ITALIC CAPITAL XI → GREEK CAPITAL LETTER XI # +1D763 ; 039E ; MA # ( 𝝣 → Ξ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL XI → GREEK CAPITAL LETTER XI # +1D79D ; 039E ; MA # ( 𝞝 → Ξ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL XI → GREEK CAPITAL LETTER XI # +2CB6 ; 039E ; MA # ( Ⲷ → Ξ ) COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE → GREEK CAPITAL LETTER XI # + +03D6 ; 03C0 ; MA # ( ϖ → π ) GREEK PI SYMBOL → GREEK SMALL LETTER PI # +213C ; 03C0 ; MA # ( ℼ → π ) DOUBLE-STRUCK SMALL PI → GREEK SMALL LETTER PI # +1D6D1 ; 03C0 ; MA # ( 𝛑 → π ) MATHEMATICAL BOLD SMALL PI → GREEK SMALL LETTER PI # +1D6E1 ; 03C0 ; MA # ( 𝛡 → π ) MATHEMATICAL BOLD PI 
SYMBOL → GREEK SMALL LETTER PI # +1D70B ; 03C0 ; MA # ( 𝜋 → π ) MATHEMATICAL ITALIC SMALL PI → GREEK SMALL LETTER PI # +1D71B ; 03C0 ; MA # ( 𝜛 → π ) MATHEMATICAL ITALIC PI SYMBOL → GREEK SMALL LETTER PI # +1D745 ; 03C0 ; MA # ( 𝝅 → π ) MATHEMATICAL BOLD ITALIC SMALL PI → GREEK SMALL LETTER PI # +1D755 ; 03C0 ; MA # ( 𝝕 → π ) MATHEMATICAL BOLD ITALIC PI SYMBOL → GREEK SMALL LETTER PI # +1D77F ; 03C0 ; MA # ( 𝝿 → π ) MATHEMATICAL SANS-SERIF BOLD SMALL PI → GREEK SMALL LETTER PI # +1D78F ; 03C0 ; MA # ( 𝞏 → π ) MATHEMATICAL SANS-SERIF BOLD PI SYMBOL → GREEK SMALL LETTER PI # +1D7B9 ; 03C0 ; MA # ( 𝞹 → π ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL PI → GREEK SMALL LETTER PI # +1D7C9 ; 03C0 ; MA # ( 𝟉 → π ) MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL → GREEK SMALL LETTER PI # +1D28 ; 03C0 ; MA # ( ᴨ → π ) GREEK LETTER SMALL CAPITAL PI → GREEK SMALL LETTER PI # →п→ +2CA1 ; 03C0 ; MA # ( ⲡ → π ) COPTIC SMALL LETTER PI → GREEK SMALL LETTER PI # →п→ +043F ; 03C0 ; MA # ( п → π ) CYRILLIC SMALL LETTER PE → GREEK SMALL LETTER PI # +16EC1 ; 03C0 ; MA # ( 𖻁 → π ) BERIA ERFE SMALL LETTER HIRDEABO → GREEK SMALL LETTER PI # →п→ + +220F ; 03A0 ; MA #* ( ∏ → Π ) N-ARY PRODUCT → GREEK CAPITAL LETTER PI # +213F ; 03A0 ; MA # ( ℿ → Π ) DOUBLE-STRUCK CAPITAL PI → GREEK CAPITAL LETTER PI # +1D6B7 ; 03A0 ; MA # ( 𝚷 → Π ) MATHEMATICAL BOLD CAPITAL PI → GREEK CAPITAL LETTER PI # +1D6F1 ; 03A0 ; MA # ( 𝛱 → Π ) MATHEMATICAL ITALIC CAPITAL PI → GREEK CAPITAL LETTER PI # +1D72B ; 03A0 ; MA # ( 𝜫 → Π ) MATHEMATICAL BOLD ITALIC CAPITAL PI → GREEK CAPITAL LETTER PI # +1D765 ; 03A0 ; MA # ( 𝝥 → Π ) MATHEMATICAL SANS-SERIF BOLD CAPITAL PI → GREEK CAPITAL LETTER PI # +1D79F ; 03A0 ; MA # ( 𝞟 → Π ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL PI → GREEK CAPITAL LETTER PI # +2CA0 ; 03A0 ; MA # ( Ⲡ → Π ) COPTIC CAPITAL LETTER PI → GREEK CAPITAL LETTER PI # +041F ; 03A0 ; MA # ( П → Π ) CYRILLIC CAPITAL LETTER PE → GREEK CAPITAL LETTER PI # +A6DB ; 03A0 ; MA # ( ꛛ → Π ) BAMUM LETTER NA → GREEK CAPITAL 
LETTER PI # +16EA6 ; 03A0 ; MA # ( 𖺦 → Π ) BERIA ERFE CAPITAL LETTER HIRDEABO → GREEK CAPITAL LETTER PI # →П→ + +102AD ; 03D8 ; MA # ( 𐊭 → Ϙ ) CARIAN LETTER T → GREEK LETTER ARCHAIC KOPPA # +10312 ; 03D8 ; MA # ( 𐌒 → Ϙ ) OLD ITALIC LETTER KU → GREEK LETTER ARCHAIC KOPPA # + +2CC1 ; 03FC ; MA # ( ⳁ → ϼ ) COPTIC SMALL LETTER SAMPI → GREEK RHO WITH STROKE SYMBOL # + +03DB ; 03C2 ; MA # ( ϛ → ς ) GREEK SMALL LETTER STIGMA → GREEK SMALL LETTER FINAL SIGMA # +1D6D3 ; 03C2 ; MA # ( 𝛓 → ς ) MATHEMATICAL BOLD SMALL FINAL SIGMA → GREEK SMALL LETTER FINAL SIGMA # +1D70D ; 03C2 ; MA # ( 𝜍 → ς ) MATHEMATICAL ITALIC SMALL FINAL SIGMA → GREEK SMALL LETTER FINAL SIGMA # +1D747 ; 03C2 ; MA # ( 𝝇 → ς ) MATHEMATICAL BOLD ITALIC SMALL FINAL SIGMA → GREEK SMALL LETTER FINAL SIGMA # +1D781 ; 03C2 ; MA # ( 𝞁 → ς ) MATHEMATICAL SANS-SERIF BOLD SMALL FINAL SIGMA → GREEK SMALL LETTER FINAL SIGMA # +1D7BB ; 03C2 ; MA # ( 𝞻 → ς ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL FINAL SIGMA → GREEK SMALL LETTER FINAL SIGMA # +2C8B ; 03C2 ; MA # ( ⲋ → ς ) COPTIC SMALL LETTER SOU → GREEK SMALL LETTER FINAL SIGMA # + +1D6BD ; 03A6 ; MA # ( 𝚽 → Φ ) MATHEMATICAL BOLD CAPITAL PHI → GREEK CAPITAL LETTER PHI # +1D6F7 ; 03A6 ; MA # ( 𝛷 → Φ ) MATHEMATICAL ITALIC CAPITAL PHI → GREEK CAPITAL LETTER PHI # +1D731 ; 03A6 ; MA # ( 𝜱 → Φ ) MATHEMATICAL BOLD ITALIC CAPITAL PHI → GREEK CAPITAL LETTER PHI # +1D76B ; 03A6 ; MA # ( 𝝫 → Φ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL PHI → GREEK CAPITAL LETTER PHI # +1D7A5 ; 03A6 ; MA # ( 𝞥 → Φ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL PHI → GREEK CAPITAL LETTER PHI # +2CAA ; 03A6 ; MA # ( Ⲫ → Φ ) COPTIC CAPITAL LETTER FI → GREEK CAPITAL LETTER PHI # +0424 ; 03A6 ; MA # ( Ф → Φ ) CYRILLIC CAPITAL LETTER EF → GREEK CAPITAL LETTER PHI # +0553 ; 03A6 ; MA # ( Փ → Φ ) ARMENIAN CAPITAL LETTER PIWR → GREEK CAPITAL LETTER PHI # +1240 ; 03A6 ; MA # ( ቀ → Φ ) ETHIOPIC SYLLABLE QA → GREEK CAPITAL LETTER PHI # →Փ→ +16F0 ; 03A6 ; MA # ( ᛰ → Φ ) RUNIC BELGTHOR SYMBOL → GREEK CAPITAL 
LETTER PHI # +102B3 ; 03A6 ; MA # ( 𐊳 → Φ ) CARIAN LETTER NN → GREEK CAPITAL LETTER PHI # + +AB53 ; 03C7 ; MA # ( ꭓ → χ ) LATIN SMALL LETTER CHI → GREEK SMALL LETTER CHI # +AB55 ; 03C7 ; MA # ( ꭕ → χ ) LATIN SMALL LETTER CHI WITH LOW LEFT SERIF → GREEK SMALL LETTER CHI # +1D6D8 ; 03C7 ; MA # ( 𝛘 → χ ) MATHEMATICAL BOLD SMALL CHI → GREEK SMALL LETTER CHI # +1D712 ; 03C7 ; MA # ( 𝜒 → χ ) MATHEMATICAL ITALIC SMALL CHI → GREEK SMALL LETTER CHI # +1D74C ; 03C7 ; MA # ( 𝝌 → χ ) MATHEMATICAL BOLD ITALIC SMALL CHI → GREEK SMALL LETTER CHI # +1D786 ; 03C7 ; MA # ( 𝞆 → χ ) MATHEMATICAL SANS-SERIF BOLD SMALL CHI → GREEK SMALL LETTER CHI # +1D7C0 ; 03C7 ; MA # ( 𝟀 → χ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL CHI → GREEK SMALL LETTER CHI # +2CAD ; 03C7 ; MA # ( ⲭ → χ ) COPTIC SMALL LETTER KHI → GREEK SMALL LETTER CHI # + +1D6D9 ; 03C8 ; MA # ( 𝛙 → ψ ) MATHEMATICAL BOLD SMALL PSI → GREEK SMALL LETTER PSI # +1D713 ; 03C8 ; MA # ( 𝜓 → ψ ) MATHEMATICAL ITALIC SMALL PSI → GREEK SMALL LETTER PSI # +1D74D ; 03C8 ; MA # ( 𝝍 → ψ ) MATHEMATICAL BOLD ITALIC SMALL PSI → GREEK SMALL LETTER PSI # +1D787 ; 03C8 ; MA # ( 𝞇 → ψ ) MATHEMATICAL SANS-SERIF BOLD SMALL PSI → GREEK SMALL LETTER PSI # +1D7C1 ; 03C8 ; MA # ( 𝟁 → ψ ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL PSI → GREEK SMALL LETTER PSI # +2CAF ; 03C8 ; MA # ( ⲯ → ψ ) COPTIC SMALL LETTER PSI → GREEK SMALL LETTER PSI # +0471 ; 03C8 ; MA # ( ѱ → ψ ) CYRILLIC SMALL LETTER PSI → GREEK SMALL LETTER PSI # +104F9 ; 03C8 ; MA # ( 𐓹 → ψ ) OSAGE SMALL LETTER GHA → GREEK SMALL LETTER PSI # + +1D6BF ; 03A8 ; MA # ( 𝚿 → Ψ ) MATHEMATICAL BOLD CAPITAL PSI → GREEK CAPITAL LETTER PSI # +1D6F9 ; 03A8 ; MA # ( 𝛹 → Ψ ) MATHEMATICAL ITALIC CAPITAL PSI → GREEK CAPITAL LETTER PSI # +1D733 ; 03A8 ; MA # ( 𝜳 → Ψ ) MATHEMATICAL BOLD ITALIC CAPITAL PSI → GREEK CAPITAL LETTER PSI # +1D76D ; 03A8 ; MA # ( 𝝭 → Ψ ) MATHEMATICAL SANS-SERIF BOLD CAPITAL PSI → GREEK CAPITAL LETTER PSI # +1D7A7 ; 03A8 ; MA # ( 𝞧 → Ψ ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL PSI → 
GREEK CAPITAL LETTER PSI # +2CAE ; 03A8 ; MA # ( Ⲯ → Ψ ) COPTIC CAPITAL LETTER PSI → GREEK CAPITAL LETTER PSI # +0470 ; 03A8 ; MA # ( Ѱ → Ψ ) CYRILLIC CAPITAL LETTER PSI → GREEK CAPITAL LETTER PSI # +104D1 ; 03A8 ; MA # ( 𐓑 → Ψ ) OSAGE CAPITAL LETTER GHA → GREEK CAPITAL LETTER PSI # +16D8 ; 03A8 ; MA # ( ᛘ → Ψ ) RUNIC LETTER LONG-BRANCH-MADR M → GREEK CAPITAL LETTER PSI # +102B5 ; 03A8 ; MA # ( 𐊵 → Ψ ) CARIAN LETTER N → GREEK CAPITAL LETTER PSI # + +2375 ; 03C9 ; MA #* ( ⍵ → ω ) APL FUNCTIONAL SYMBOL OMEGA → GREEK SMALL LETTER OMEGA # +A7B7 ; 03C9 ; MA # ( ꞷ → ω ) LATIN SMALL LETTER OMEGA → GREEK SMALL LETTER OMEGA # +1D6DA ; 03C9 ; MA # ( 𝛚 → ω ) MATHEMATICAL BOLD SMALL OMEGA → GREEK SMALL LETTER OMEGA # +1D714 ; 03C9 ; MA # ( 𝜔 → ω ) MATHEMATICAL ITALIC SMALL OMEGA → GREEK SMALL LETTER OMEGA # +1D74E ; 03C9 ; MA # ( 𝝎 → ω ) MATHEMATICAL BOLD ITALIC SMALL OMEGA → GREEK SMALL LETTER OMEGA # +1D788 ; 03C9 ; MA # ( 𝞈 → ω ) MATHEMATICAL SANS-SERIF BOLD SMALL OMEGA → GREEK SMALL LETTER OMEGA # +1D7C2 ; 03C9 ; MA # ( 𝟂 → ω ) MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL OMEGA → GREEK SMALL LETTER OMEGA # +2CB1 ; 03C9 ; MA # ( ⲱ → ω ) COPTIC SMALL LETTER OOU → GREEK SMALL LETTER OMEGA # +A64D ; 03C9 ; MA # ( ꙍ → ω ) CYRILLIC SMALL LETTER BROAD OMEGA → GREEK SMALL LETTER OMEGA # →ꞷ→ + +2126 ; 03A9 ; MA # ( Ω → Ω ) OHM SIGN → GREEK CAPITAL LETTER OMEGA # +1D6C0 ; 03A9 ; MA # ( 𝛀 → Ω ) MATHEMATICAL BOLD CAPITAL OMEGA → GREEK CAPITAL LETTER OMEGA # +1D6FA ; 03A9 ; MA # ( 𝛺 → Ω ) MATHEMATICAL ITALIC CAPITAL OMEGA → GREEK CAPITAL LETTER OMEGA # +1D734 ; 03A9 ; MA # ( 𝜴 → Ω ) MATHEMATICAL BOLD ITALIC CAPITAL OMEGA → GREEK CAPITAL LETTER OMEGA # +1D76E ; 03A9 ; MA # ( 𝝮 → Ω ) MATHEMATICAL SANS-SERIF BOLD CAPITAL OMEGA → GREEK CAPITAL LETTER OMEGA # +1D7A8 ; 03A9 ; MA # ( 𝞨 → Ω ) MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL OMEGA → GREEK CAPITAL LETTER OMEGA # +162F ; 03A9 ; MA # ( ᘯ → Ω ) CANADIAN SYLLABICS CARRIER LHO → GREEK CAPITAL LETTER OMEGA # +1635 ; 03A9 ; MA # ( ᘵ → Ω ) 
CANADIAN SYLLABICS CARRIER TLHO → GREEK CAPITAL LETTER OMEGA # →ᘯ→ +102B6 ; 03A9 ; MA # ( 𐊶 → Ω ) CARIAN LETTER TT2 → GREEK CAPITAL LETTER OMEGA # + +2379 ; 03C9 0332 ; MA #* ( ⍹ → ω̲ ) APL FUNCTIONAL SYMBOL OMEGA UNDERBAR → GREEK SMALL LETTER OMEGA, COMBINING LOW LINE # + +1F7D ; 1FF4 ; MA # ( ώ → ῴ ) GREEK SMALL LETTER OMEGA WITH OXIA → GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI # + +0497 ; 0436 0329 ; MA # ( җ → ж̩ ) CYRILLIC SMALL LETTER ZHE WITH DESCENDER → CYRILLIC SMALL LETTER ZHE, COMBINING VERTICAL LINE BELOW # + +0496 ; 0416 0329 ; MA # ( Җ → Ж̩ ) CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER → CYRILLIC CAPITAL LETTER ZHE, COMBINING VERTICAL LINE BELOW # + +1D20B ; 0418 ; MA #* ( 𝈋 → И ) GREEK VOCAL NOTATION SYMBOL-12 → CYRILLIC CAPITAL LETTER I # →Ͷ→ +0376 ; 0418 ; MA # ( Ͷ → И ) GREEK CAPITAL LETTER PAMPHYLIAN DIGAMMA → CYRILLIC CAPITAL LETTER I # +A6A1 ; 0418 ; MA # ( ꚡ → И ) BAMUM LETTER KA → CYRILLIC CAPITAL LETTER I # →Ͷ→ +10425 ; 0418 ; MA # ( 𐐥 → И ) DESERET CAPITAL LETTER ENG → CYRILLIC CAPITAL LETTER I # + +0419 ; 040D ; MA # ( Й → Ѝ ) CYRILLIC CAPITAL LETTER SHORT I → CYRILLIC CAPITAL LETTER I WITH GRAVE # + +048A ; 040D 0326 ; MA # ( Ҋ → Ѝ̦ ) CYRILLIC CAPITAL LETTER SHORT I WITH TAIL → CYRILLIC CAPITAL LETTER I WITH GRAVE, COMBINING COMMA BELOW # →Й̡→ + +045D ; 0439 ; MA # ( ѝ → й ) CYRILLIC SMALL LETTER I WITH GRAVE → CYRILLIC SMALL LETTER SHORT I # + +048B ; 0439 0326 ; MA # ( ҋ → й̦ ) CYRILLIC SMALL LETTER SHORT I WITH TAIL → CYRILLIC SMALL LETTER SHORT I, COMBINING COMMA BELOW # →й̡→ + +104BC ; 04C3 ; MA # ( 𐒼 → Ӄ ) OSAGE CAPITAL LETTER KA → CYRILLIC CAPITAL LETTER KA WITH HOOK # + +1D2B ; 043B ; MA # ( ᴫ → л ) CYRILLIC LETTER SMALL CAPITAL EL → CYRILLIC SMALL LETTER EL # + +04C6 ; 043B 0326 ; MA # ( ӆ → л̦ ) CYRILLIC SMALL LETTER EL WITH TAIL → CYRILLIC SMALL LETTER EL, COMBINING COMMA BELOW # →л̡→ + +AB60 ; 0459 ; MA # ( ꭠ → љ ) LATIN SMALL LETTER SAKHA YAT → CYRILLIC SMALL LETTER LJE # + +104EB ; A669 ; MA # ( 𐓫 → ꙩ ) OSAGE SMALL 
LETTER OIN → CYRILLIC SMALL LETTER MONOCULAR O # + +1DEE ; 2DEC ; MA # ( ᷮ → ⷬ ) COMBINING LATIN SMALL LETTER P → COMBINING CYRILLIC LETTER ER # + +104CD ; 040B ; MA # ( 𐓍 → Ћ ) OSAGE CAPITAL LETTER DHA → CYRILLIC CAPITAL LETTER TSHE # + +1D202 ; 04FE ; MA #* ( 𝈂 → Ӿ ) GREEK VOCAL NOTATION SYMBOL-3 → CYRILLIC CAPITAL LETTER HA WITH STROKE # + +1D222 ; 0460 ; MA #* ( 𝈢 → Ѡ ) GREEK INSTRUMENTAL NOTATION SYMBOL-8 → CYRILLIC CAPITAL LETTER OMEGA # +13C7 ; 0460 ; MA # ( Ꮗ → Ѡ ) CHEROKEE LETTER QUE → CYRILLIC CAPITAL LETTER OMEGA # +15EF ; 0460 ; MA # ( ᗯ → Ѡ ) CANADIAN SYLLABICS CARRIER GU → CYRILLIC CAPITAL LETTER OMEGA # + +047C ; 0460 0486 0487 ; MA # ( Ѽ → Ѡ҆҇ ) CYRILLIC CAPITAL LETTER OMEGA WITH TITLO → CYRILLIC CAPITAL LETTER OMEGA, COMBINING CYRILLIC PSILI PNEUMATA, COMBINING CYRILLIC POKRYTIE # + +18ED ; 0460 00B7 ; MA # ( ᣭ → Ѡ· ) CANADIAN SYLLABICS CARRIER GWU → CYRILLIC CAPITAL LETTER OMEGA, MIDDLE DOT # →ᗯᐧ→ + +A7B6 ; A64C ; MA # ( Ꞷ → Ꙍ ) LATIN CAPITAL LETTER OMEGA → CYRILLIC CAPITAL LETTER BROAD OMEGA # +2CB0 ; A64C ; MA # ( Ⲱ → Ꙍ ) COPTIC CAPITAL LETTER OOU → CYRILLIC CAPITAL LETTER BROAD OMEGA # + +0AEB ; 0447 ; MA # ( ૫ → ч ) GUJARATI DIGIT FIVE → CYRILLIC SMALL LETTER CHE # +03E5 ; 0447 ; MA # ( ϥ → ч ) COPTIC SMALL LETTER FEI → CYRILLIC SMALL LETTER CHE # +0AAA ; 0447 ; MA # ( પ → ч ) GUJARATI LETTER PA → CYRILLIC SMALL LETTER CHE # →૫→ + +03E4 ; 0427 ; MA # ( Ϥ → Ч ) COPTIC CAPITAL LETTER FEI → CYRILLIC CAPITAL LETTER CHE # + +04CC ; 04B7 ; MA # ( ӌ → ҷ ) CYRILLIC SMALL LETTER KHAKASSIAN CHE → CYRILLIC SMALL LETTER CHE WITH DESCENDER # + +04CB ; 04B6 ; MA # ( Ӌ → Ҷ ) CYRILLIC CAPITAL LETTER KHAKASSIAN CHE → CYRILLIC CAPITAL LETTER CHE WITH DESCENDER # + +04BE ; 04BC 0328 ; MA # ( Ҿ → Ҽ̨ ) CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER → CYRILLIC CAPITAL LETTER ABKHASIAN CHE, COMBINING OGONEK # + +2CBC ; 0428 ; MA # ( Ⲽ → Ш ) COPTIC CAPITAL LETTER CRYPTOGRAMMIC NI → CYRILLIC CAPITAL LETTER SHA # + +A650 ; 042A 006C ; MA # ( Ꙑ → Ъl ) CYRILLIC 
CAPITAL LETTER YERU WITH BACK YER → CYRILLIC CAPITAL LETTER HARD SIGN, LATIN SMALL LETTER L # →ЪІ→ + +2108 ; 042D ; MA #* ( ℈ → Э ) SCRUPLE → CYRILLIC CAPITAL LETTER E # + +1F701 ; A658 ; MA #* ( 🜁 → Ꙙ ) ALCHEMICAL SYMBOL FOR AIR → CYRILLIC CAPITAL LETTER CLOSED LITTLE YUS # +16F1C ; A658 ; MA # ( 𖼜 → Ꙙ ) MIAO LETTER TLHYA → CYRILLIC CAPITAL LETTER CLOSED LITTLE YUS # + +A992 ; 2C3F ; MA # ( ꦒ → ⰿ ) JAVANESE LETTER GA → GLAGOLITIC SMALL LETTER MYSLITE # + +0587 ; 0565 0069 ; MA # ( և → եi ) ARMENIAN SMALL LIGATURE ECH YIWN → ARMENIAN SMALL LETTER ECH, LATIN SMALL LETTER I # →եւ→ + +1294 ; 0571 ; MA # ( ኔ → ձ ) ETHIOPIC SYLLABLE NEE → ARMENIAN SMALL LETTER JA # + +FB14 ; 0574 0565 ; MA # ( ﬔ → մե ) ARMENIAN SMALL LIGATURE MEN ECH → ARMENIAN SMALL LETTER MEN, ARMENIAN SMALL LETTER ECH # + +FB15 ; 0574 056B ; MA # ( ﬕ → մի ) ARMENIAN SMALL LIGATURE MEN INI → ARMENIAN SMALL LETTER MEN, ARMENIAN SMALL LETTER INI # + +FB17 ; 0574 056D ; MA # ( ﬗ → մխ ) ARMENIAN SMALL LIGATURE MEN XEH → ARMENIAN SMALL LETTER MEN, ARMENIAN SMALL LETTER XEH # + +FB13 ; 0574 0576 ; MA # ( ﬓ → մն ) ARMENIAN SMALL LIGATURE MEN NOW → ARMENIAN SMALL LETTER MEN, ARMENIAN SMALL LETTER NOW # + +2229 ; 0548 ; MA #* ( ∩ → Ո ) INTERSECTION → ARMENIAN CAPITAL LETTER VO # →ᑎ→ +22C2 ; 0548 ; MA #* ( ⋂ → Ո ) N-ARY INTERSECTION → ARMENIAN CAPITAL LETTER VO # →∩→→ᑎ→ +1D245 ; 0548 ; MA #* ( 𝉅 → Ո ) GREEK MUSICAL LEIMMA → ARMENIAN CAPITAL LETTER VO # →∩→→ᑎ→ +1260 ; 0548 ; MA # ( በ → Ո ) ETHIOPIC SYLLABLE BA → ARMENIAN CAPITAL LETTER VO # +144E ; 0548 ; MA # ( ᑎ → Ո ) CANADIAN SYLLABICS TI → ARMENIAN CAPITAL LETTER VO # +A4F5 ; 0548 ; MA # ( ꓵ → Ո ) LISU LETTER UE → ARMENIAN CAPITAL LETTER VO # →∩→→ᑎ→ + +145A ; 0548 00B7 ; MA # ( ᑚ → Ո· ) CANADIAN SYLLABICS WEST-CREE TWI → ARMENIAN CAPITAL LETTER VO, MIDDLE DOT # →ᑎᐧ→→ᑎ·→ + +1468 ; 0548 0027 ; MA # ( ᑨ → Ո' ) CANADIAN SYLLABICS TTI → ARMENIAN CAPITAL LETTER VO, APOSTROPHE # →ᑎᑊ→→ᑎ'→ + +FB16 ; 057E 0576 ; MA # ( ﬖ → վն ) ARMENIAN SMALL LIGATURE VEW NOW → 
ARMENIAN SMALL LETTER VEW, ARMENIAN SMALL LETTER NOW # + +2CE8 ; 0554 ; MA #* ( ⳨ → Ք ) COPTIC SYMBOL TAU RO → ARMENIAN CAPITAL LETTER KEH # →₽→ +101A0 ; 0554 ; MA #* ( 𐆠 → Ք ) GREEK SYMBOL TAU RHO → ARMENIAN CAPITAL LETTER KEH # →⳨→→₽→ +20BD ; 0554 ; MA #* ( ₽ → Ք ) RUBLE SIGN → ARMENIAN CAPITAL LETTER KEH # +2CC0 ; 0554 ; MA # ( Ⳁ → Ք ) COPTIC CAPITAL LETTER SAMPI → ARMENIAN CAPITAL LETTER KEH # →₽→ + +02D3 ; 0559 ; MA #* ( ˓ → ՙ ) MODIFIER LETTER CENTRED LEFT HALF RING → ARMENIAN MODIFIER LETTER LEFT HALF RING # +02BF ; 0559 ; MA # ( ʿ → ՙ ) MODIFIER LETTER LEFT HALF RING → ARMENIAN MODIFIER LETTER LEFT HALF RING # + +2135 ; 05D0 ; MA # ( ℵ → ‎א‎ ) ALEF SYMBOL → HEBREW LETTER ALEF # +FB21 ; 05D0 ; MA # ( ‎ﬡ‎ → ‎א‎ ) HEBREW LETTER WIDE ALEF → HEBREW LETTER ALEF # + +FB2F ; FB2E ; MA # ( ‎אָ‎ → ‎אַ‎ ) HEBREW LETTER ALEF WITH QAMATS → HEBREW LETTER ALEF WITH PATAH # +FB30 ; FB2E ; MA # ( ‎אּ‎ → ‎אַ‎ ) HEBREW LETTER ALEF WITH MAPIQ → HEBREW LETTER ALEF WITH PATAH # + +FB4F ; 05D0 05DC ; MA # ( ‎ﭏ‎ → ‎אל‎ ) HEBREW LIGATURE ALEF LAMED → HEBREW LETTER ALEF, HEBREW LETTER LAMED # + +2136 ; 05D1 ; MA # ( ℶ → ‎ב‎ ) BET SYMBOL → HEBREW LETTER BET # + +2137 ; 05D2 ; MA # ( ℷ → ‎ג‎ ) GIMEL SYMBOL → HEBREW LETTER GIMEL # + +2138 ; 05D3 ; MA # ( ℸ → ‎ד‎ ) DALET SYMBOL → HEBREW LETTER DALET # +FB22 ; 05D3 ; MA # ( ‎ﬢ‎ → ‎ד‎ ) HEBREW LETTER WIDE DALET → HEBREW LETTER DALET # + +FB23 ; 05D4 ; MA # ( ‎ﬣ‎ → ‎ה‎ ) HEBREW LETTER WIDE HE → HEBREW LETTER HE # + +FB39 ; FB1D ; MA # ( ‎יּ‎ → ‎יִ‎ ) HEBREW LETTER YOD WITH DAGESH → HEBREW LETTER YOD WITH HIRIQ # + +FB24 ; 05DB ; MA # ( ‎ﬤ‎ → ‎כ‎ ) HEBREW LETTER WIDE KAF → HEBREW LETTER KAF # + +FB25 ; 05DC ; MA # ( ‎ﬥ‎ → ‎ל‎ ) HEBREW LETTER WIDE LAMED → HEBREW LETTER LAMED # + +FB26 ; 05DD ; MA # ( ‎ﬦ‎ → ‎ם‎ ) HEBREW LETTER WIDE FINAL MEM → HEBREW LETTER FINAL MEM # + +FB20 ; 05E2 ; MA # ( ‎ﬠ‎ → ‎ע‎ ) HEBREW LETTER ALTERNATIVE AYIN → HEBREW LETTER AYIN # + +FB27 ; 05E8 ; MA # ( ‎ﬧ‎ → ‎ר‎ ) HEBREW LETTER WIDE RESH → HEBREW LETTER RESH # + 
+FB2B ; FB2A ; MA # ( ‎שׂ‎ → ‎שׁ‎ ) HEBREW LETTER SHIN WITH SIN DOT → HEBREW LETTER SHIN WITH SHIN DOT # +FB49 ; FB2A ; MA # ( ‎שּ‎ → ‎שׁ‎ ) HEBREW LETTER SHIN WITH DAGESH → HEBREW LETTER SHIN WITH SHIN DOT # + +FB2D ; FB2C ; MA # ( ‎שּׂ‎ → ‎שּׁ‎ ) HEBREW LETTER SHIN WITH DAGESH AND SIN DOT → HEBREW LETTER SHIN WITH DAGESH AND SHIN DOT # + +FB28 ; 05EA ; MA # ( ‎ﬨ‎ → ‎ת‎ ) HEBREW LETTER WIDE TAV → HEBREW LETTER TAV # + +FE80 ; 0621 ; MA # ( ‎ﺀ‎ → ‎ء‎ ) ARABIC LETTER HAMZA ISOLATED FORM → ARABIC LETTER HAMZA # + +06FD ; 0621 10EFA ; MA #* ( ‎۽‎ → ‎ء𐻺‎ ) ARABIC SIGN SINDHI AMPERSAND → ARABIC LETTER HAMZA, ARABIC DOUBLE VERTICAL BAR BELOW # + +FE82 ; 0622 ; MA # ( ‎ﺂ‎ → ‎آ‎ ) ARABIC LETTER ALEF WITH MADDA ABOVE FINAL FORM → ARABIC LETTER ALEF WITH MADDA ABOVE # +FE81 ; 0622 ; MA # ( ‎ﺁ‎ → ‎آ‎ ) ARABIC LETTER ALEF WITH MADDA ABOVE ISOLATED FORM → ARABIC LETTER ALEF WITH MADDA ABOVE # + +FB51 ; 0671 ; MA # ( ‎ﭑ‎ → ‎ٱ‎ ) ARABIC LETTER ALEF WASLA FINAL FORM → ARABIC LETTER ALEF WASLA # +FB50 ; 0671 ; MA # ( ‎ﭐ‎ → ‎ٱ‎ ) ARABIC LETTER ALEF WASLA ISOLATED FORM → ARABIC LETTER ALEF WASLA # + +1EE01 ; 0628 ; MA # ( ‎𞸁‎ → ‎ب‎ ) ARABIC MATHEMATICAL BEH → ARABIC LETTER BEH # +1EE21 ; 0628 ; MA # ( ‎𞸡‎ → ‎ب‎ ) ARABIC MATHEMATICAL INITIAL BEH → ARABIC LETTER BEH # +1EE61 ; 0628 ; MA # ( ‎𞹡‎ → ‎ب‎ ) ARABIC MATHEMATICAL STRETCHED BEH → ARABIC LETTER BEH # +1EE81 ; 0628 ; MA # ( ‎𞺁‎ → ‎ب‎ ) ARABIC MATHEMATICAL LOOPED BEH → ARABIC LETTER BEH # +1EEA1 ; 0628 ; MA # ( ‎𞺡‎ → ‎ب‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK BEH → ARABIC LETTER BEH # +FE91 ; 0628 ; MA # ( ‎ﺑ‎ → ‎ب‎ ) ARABIC LETTER BEH INITIAL FORM → ARABIC LETTER BEH # +FE92 ; 0628 ; MA # ( ‎ﺒ‎ → ‎ب‎ ) ARABIC LETTER BEH MEDIAL FORM → ARABIC LETTER BEH # +FE90 ; 0628 ; MA # ( ‎ﺐ‎ → ‎ب‎ ) ARABIC LETTER BEH FINAL FORM → ARABIC LETTER BEH # +FE8F ; 0628 ; MA # ( ‎ﺏ‎ → ‎ب‎ ) ARABIC LETTER BEH ISOLATED FORM → ARABIC LETTER BEH # + +0751 ; 0628 06DB ; MA # ( ‎ݑ‎ → ‎بۛ‎ ) ARABIC LETTER BEH WITH DOT BELOW AND THREE DOTS ABOVE → ARABIC LETTER 
BEH, ARABIC SMALL HIGH THREE DOTS # + +08B6 ; 0628 06E2 ; MA # ( ‎ࢶ‎ → ‎بۢ‎ ) ARABIC LETTER BEH WITH SMALL MEEM ABOVE → ARABIC LETTER BEH, ARABIC SMALL HIGH MEEM ISOLATED FORM # + +08A1 ; 0628 0654 ; MA # ( ‎ࢡ‎ → ‎بٔ‎ ) ARABIC LETTER BEH WITH HAMZA ABOVE → ARABIC LETTER BEH, ARABIC HAMZA ABOVE # + +FCA0 ; 0628 006F ; MA # ( ‎ﲠ‎ → ‎بo‎ ) ARABIC LIGATURE BEH WITH HEH INITIAL FORM → ARABIC LETTER BEH, LATIN SMALL LETTER O # →‎به‎→ +FCE2 ; 0628 006F ; MA # ( ‎ﳢ‎ → ‎بo‎ ) ARABIC LIGATURE BEH WITH HEH MEDIAL FORM → ARABIC LETTER BEH, LATIN SMALL LETTER O # →‎به‎→ + +FC9C ; 0628 062C ; MA # ( ‎ﲜ‎ → ‎بج‎ ) ARABIC LIGATURE BEH WITH JEEM INITIAL FORM → ARABIC LETTER BEH, ARABIC LETTER JEEM # +FC05 ; 0628 062C ; MA # ( ‎ﰅ‎ → ‎بج‎ ) ARABIC LIGATURE BEH WITH JEEM ISOLATED FORM → ARABIC LETTER BEH, ARABIC LETTER JEEM # + +FC9D ; 0628 062D ; MA # ( ‎ﲝ‎ → ‎بح‎ ) ARABIC LIGATURE BEH WITH HAH INITIAL FORM → ARABIC LETTER BEH, ARABIC LETTER HAH # +FC06 ; 0628 062D ; MA # ( ‎ﰆ‎ → ‎بح‎ ) ARABIC LIGATURE BEH WITH HAH ISOLATED FORM → ARABIC LETTER BEH, ARABIC LETTER HAH # + +FDC2 ; 0628 062D 0649 ; MA # ( ‎ﷂ‎ → ‎بحى‎ ) ARABIC LIGATURE BEH WITH HAH WITH YEH FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎بحي‎→ + +FC9E ; 0628 062E ; MA # ( ‎ﲞ‎ → ‎بخ‎ ) ARABIC LIGATURE BEH WITH KHAH INITIAL FORM → ARABIC LETTER BEH, ARABIC LETTER KHAH # +FC07 ; 0628 062E ; MA # ( ‎ﰇ‎ → ‎بخ‎ ) ARABIC LIGATURE BEH WITH KHAH ISOLATED FORM → ARABIC LETTER BEH, ARABIC LETTER KHAH # +FCD2 ; 0628 062E ; MA # ( ‎ﳒ‎ → ‎بخ‎ ) ARABIC LIGATURE NOON WITH JEEM INITIAL FORM → ARABIC LETTER BEH, ARABIC LETTER KHAH # →‎ﲞ‎→ +FC4B ; 0628 062E ; MA # ( ‎ﱋ‎ → ‎بخ‎ ) ARABIC LIGATURE NOON WITH JEEM ISOLATED FORM → ARABIC LETTER BEH, ARABIC LETTER KHAH # →‎نج‎→→‎ﳒ‎→→‎ﲞ‎→ + +FD9E ; 0628 062E 0649 ; MA # ( ‎ﶞ‎ → ‎بخى‎ ) ARABIC LIGATURE BEH WITH KHAH WITH YEH FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # →‎بخي‎→ + +FC6A ; 0628 0631 ; MA # ( ‎ﱪ‎ → ‎بر‎ ) ARABIC 
LIGATURE BEH WITH REH FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER REH # + +FC6B ; 0628 0632 ; MA # ( ‎ﱫ‎ → ‎بز‎ ) ARABIC LIGATURE BEH WITH ZAIN FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER ZAIN # + +FC9F ; 0628 0645 ; MA # ( ‎ﲟ‎ → ‎بم‎ ) ARABIC LIGATURE BEH WITH MEEM INITIAL FORM → ARABIC LETTER BEH, ARABIC LETTER MEEM # +FCE1 ; 0628 0645 ; MA # ( ‎ﳡ‎ → ‎بم‎ ) ARABIC LIGATURE BEH WITH MEEM MEDIAL FORM → ARABIC LETTER BEH, ARABIC LETTER MEEM # +FC6C ; 0628 0645 ; MA # ( ‎ﱬ‎ → ‎بم‎ ) ARABIC LIGATURE BEH WITH MEEM FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER MEEM # +FC08 ; 0628 0645 ; MA # ( ‎ﰈ‎ → ‎بم‎ ) ARABIC LIGATURE BEH WITH MEEM ISOLATED FORM → ARABIC LETTER BEH, ARABIC LETTER MEEM # + +FC6D ; 0628 0646 ; MA # ( ‎ﱭ‎ → ‎بن‎ ) ARABIC LIGATURE BEH WITH NOON FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER NOON # + +FC6E ; 0628 0649 ; MA # ( ‎ﱮ‎ → ‎بى‎ ) ARABIC LIGATURE BEH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER ALEF MAKSURA # +FC09 ; 0628 0649 ; MA # ( ‎ﰉ‎ → ‎بى‎ ) ARABIC LIGATURE BEH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER BEH, ARABIC LETTER ALEF MAKSURA # +FC6F ; 0628 0649 ; MA # ( ‎ﱯ‎ → ‎بى‎ ) ARABIC LIGATURE BEH WITH YEH FINAL FORM → ARABIC LETTER BEH, ARABIC LETTER ALEF MAKSURA # →‎بي‎→ +FC0A ; 0628 0649 ; MA # ( ‎ﰊ‎ → ‎بى‎ ) ARABIC LIGATURE BEH WITH YEH ISOLATED FORM → ARABIC LETTER BEH, ARABIC LETTER ALEF MAKSURA # →‎بي‎→ + +FB54 ; 067B ; MA # ( ‎ﭔ‎ → ‎ٻ‎ ) ARABIC LETTER BEEH INITIAL FORM → ARABIC LETTER BEEH # +FB55 ; 067B ; MA # ( ‎ﭕ‎ → ‎ٻ‎ ) ARABIC LETTER BEEH MEDIAL FORM → ARABIC LETTER BEEH # +FB53 ; 067B ; MA # ( ‎ﭓ‎ → ‎ٻ‎ ) ARABIC LETTER BEEH FINAL FORM → ARABIC LETTER BEEH # +FB52 ; 067B ; MA # ( ‎ﭒ‎ → ‎ٻ‎ ) ARABIC LETTER BEEH ISOLATED FORM → ARABIC LETTER BEEH # +06D0 ; 067B ; MA # ( ‎ې‎ → ‎ٻ‎ ) ARABIC LETTER E → ARABIC LETTER BEEH # +FBE6 ; 067B ; MA # ( ‎ﯦ‎ → ‎ٻ‎ ) ARABIC LETTER E INITIAL FORM → ARABIC LETTER BEEH # →‎ې‎→ +FBE7 ; 067B ; MA # ( ‎ﯧ‎ → ‎ٻ‎ ) ARABIC LETTER E MEDIAL FORM → ARABIC LETTER BEEH # →‎ې‎→ 
+FBE5 ; 067B ; MA # ( ‎ﯥ‎ → ‎ٻ‎ ) ARABIC LETTER E FINAL FORM → ARABIC LETTER BEEH # →‎ې‎→ +FBE4 ; 067B ; MA # ( ‎ﯤ‎ → ‎ٻ‎ ) ARABIC LETTER E ISOLATED FORM → ARABIC LETTER BEEH # →‎ې‎→ + +FB5C ; 0680 ; MA # ( ‎ﭜ‎ → ‎ڀ‎ ) ARABIC LETTER BEHEH INITIAL FORM → ARABIC LETTER BEHEH # +FB5D ; 0680 ; MA # ( ‎ﭝ‎ → ‎ڀ‎ ) ARABIC LETTER BEHEH MEDIAL FORM → ARABIC LETTER BEHEH # +FB5B ; 0680 ; MA # ( ‎ﭛ‎ → ‎ڀ‎ ) ARABIC LETTER BEHEH FINAL FORM → ARABIC LETTER BEHEH # +FB5A ; 0680 ; MA # ( ‎ﭚ‎ → ‎ڀ‎ ) ARABIC LETTER BEHEH ISOLATED FORM → ARABIC LETTER BEHEH # +10EC7 ; 0680 ; MA # ( ‎𐻇‎ → ‎ڀ‎ ) ARABIC LETTER YEH WITH FOUR DOTS BELOW → ARABIC LETTER BEHEH # + +08A9 ; 0754 ; MA # ( ‎ࢩ‎ → ‎ݔ‎ ) ARABIC LETTER YEH WITH TWO DOTS BELOW AND DOT ABOVE → ARABIC LETTER BEH WITH TWO DOTS BELOW AND DOT ABOVE # +0767 ; 0754 ; MA # ( ‎ݧ‎ → ‎ݔ‎ ) ARABIC LETTER NOON WITH TWO DOTS BELOW → ARABIC LETTER BEH WITH TWO DOTS BELOW AND DOT ABOVE # + +2365 ; 0629 ; MA #* ( ⍥ → ‎ة‎ ) APL FUNCTIONAL SYMBOL CIRCLE DIAERESIS → ARABIC LETTER TEH MARBUTA # →ö→ +00F6 ; 0629 ; MA # ( ö → ‎ة‎ ) LATIN SMALL LETTER O WITH DIAERESIS → ARABIC LETTER TEH MARBUTA # +FE94 ; 0629 ; MA # ( ‎ﺔ‎ → ‎ة‎ ) ARABIC LETTER TEH MARBUTA FINAL FORM → ARABIC LETTER TEH MARBUTA # +FE93 ; 0629 ; MA # ( ‎ﺓ‎ → ‎ة‎ ) ARABIC LETTER TEH MARBUTA ISOLATED FORM → ARABIC LETTER TEH MARBUTA # +06C3 ; 0629 ; MA # ( ‎ۃ‎ → ‎ة‎ ) ARABIC LETTER TEH MARBUTA GOAL → ARABIC LETTER TEH MARBUTA # + +1EE15 ; 062A ; MA # ( ‎𞸕‎ → ‎ت‎ ) ARABIC MATHEMATICAL TEH → ARABIC LETTER TEH # +1EE35 ; 062A ; MA # ( ‎𞸵‎ → ‎ت‎ ) ARABIC MATHEMATICAL INITIAL TEH → ARABIC LETTER TEH # +1EE75 ; 062A ; MA # ( ‎𞹵‎ → ‎ت‎ ) ARABIC MATHEMATICAL STRETCHED TEH → ARABIC LETTER TEH # +1EE95 ; 062A ; MA # ( ‎𞺕‎ → ‎ت‎ ) ARABIC MATHEMATICAL LOOPED TEH → ARABIC LETTER TEH # +1EEB5 ; 062A ; MA # ( ‎𞺵‎ → ‎ت‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK TEH → ARABIC LETTER TEH # +FE97 ; 062A ; MA # ( ‎ﺗ‎ → ‎ت‎ ) ARABIC LETTER TEH INITIAL FORM → ARABIC LETTER TEH # +FE98 ; 062A ; MA # ( ‎ﺘ‎ → ‎ت‎ ) ARABIC 
LETTER TEH MEDIAL FORM → ARABIC LETTER TEH # +FE96 ; 062A ; MA # ( ‎ﺖ‎ → ‎ت‎ ) ARABIC LETTER TEH FINAL FORM → ARABIC LETTER TEH # +FE95 ; 062A ; MA # ( ‎ﺕ‎ → ‎ت‎ ) ARABIC LETTER TEH ISOLATED FORM → ARABIC LETTER TEH # +067A ; 062A ; MA # ( ‎ٺ‎ → ‎ت‎ ) ARABIC LETTER TTEHEH → ARABIC LETTER TEH # +FB60 ; 062A ; MA # ( ‎ﭠ‎ → ‎ت‎ ) ARABIC LETTER TTEHEH INITIAL FORM → ARABIC LETTER TEH # →‎ٺ‎→ +FB61 ; 062A ; MA # ( ‎ﭡ‎ → ‎ت‎ ) ARABIC LETTER TTEHEH MEDIAL FORM → ARABIC LETTER TEH # →‎ٺ‎→ +FB5F ; 062A ; MA # ( ‎ﭟ‎ → ‎ت‎ ) ARABIC LETTER TTEHEH FINAL FORM → ARABIC LETTER TEH # →‎ٺ‎→ +FB5E ; 062A ; MA # ( ‎ﭞ‎ → ‎ت‎ ) ARABIC LETTER TTEHEH ISOLATED FORM → ARABIC LETTER TEH # →‎ٺ‎→ + +08BF ; 062A 0306 ; MA # ( ‎ࢿ‎ → ‎ت̆‎ ) ARABIC LETTER TEH WITH SMALL V → ARABIC LETTER TEH, COMBINING BREVE # →‎تٚ‎→ + +FCA5 ; 062A 006F ; MA # ( ‎ﲥ‎ → ‎تo‎ ) ARABIC LIGATURE TEH WITH HEH INITIAL FORM → ARABIC LETTER TEH, LATIN SMALL LETTER O # →‎ته‎→ +FCE4 ; 062A 006F ; MA # ( ‎ﳤ‎ → ‎تo‎ ) ARABIC LIGATURE TEH WITH HEH MEDIAL FORM → ARABIC LETTER TEH, LATIN SMALL LETTER O # →‎ته‎→ + +FCA1 ; 062A 062C ; MA # ( ‎ﲡ‎ → ‎تج‎ ) ARABIC LIGATURE TEH WITH JEEM INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER JEEM # +FC0B ; 062A 062C ; MA # ( ‎ﰋ‎ → ‎تج‎ ) ARABIC LIGATURE TEH WITH JEEM ISOLATED FORM → ARABIC LETTER TEH, ARABIC LETTER JEEM # + +FD50 ; 062A 062C 0645 ; MA # ( ‎ﵐ‎ → ‎تجم‎ ) ARABIC LIGATURE TEH WITH JEEM WITH MEEM INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER JEEM, ARABIC LETTER MEEM # + +FDA0 ; 062A 062C 0649 ; MA # ( ‎ﶠ‎ → ‎تجى‎ ) ARABIC LIGATURE TEH WITH JEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # +FD9F ; 062A 062C 0649 ; MA # ( ‎ﶟ‎ → ‎تجى‎ ) ARABIC LIGATURE TEH WITH JEEM WITH YEH FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎تجي‎→ + +FCA2 ; 062A 062D ; MA # ( ‎ﲢ‎ → ‎تح‎ ) ARABIC LIGATURE TEH WITH HAH INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER HAH # +FC0C ; 062A 062D ; MA # ( ‎ﰌ‎ → ‎تح‎ ) 
ARABIC LIGATURE TEH WITH HAH ISOLATED FORM → ARABIC LETTER TEH, ARABIC LETTER HAH # + +FD52 ; 062A 062D 062C ; MA # ( ‎ﵒ‎ → ‎تحج‎ ) ARABIC LIGATURE TEH WITH HAH WITH JEEM INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER HAH, ARABIC LETTER JEEM # +FD51 ; 062A 062D 062C ; MA # ( ‎ﵑ‎ → ‎تحج‎ ) ARABIC LIGATURE TEH WITH HAH WITH JEEM FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER HAH, ARABIC LETTER JEEM # + +FD53 ; 062A 062D 0645 ; MA # ( ‎ﵓ‎ → ‎تحم‎ ) ARABIC LIGATURE TEH WITH HAH WITH MEEM INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER HAH, ARABIC LETTER MEEM # + +FCA3 ; 062A 062E ; MA # ( ‎ﲣ‎ → ‎تخ‎ ) ARABIC LIGATURE TEH WITH KHAH INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER KHAH # +FC0D ; 062A 062E ; MA # ( ‎ﰍ‎ → ‎تخ‎ ) ARABIC LIGATURE TEH WITH KHAH ISOLATED FORM → ARABIC LETTER TEH, ARABIC LETTER KHAH # + +FD54 ; 062A 062E 0645 ; MA # ( ‎ﵔ‎ → ‎تخم‎ ) ARABIC LIGATURE TEH WITH KHAH WITH MEEM INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER KHAH, ARABIC LETTER MEEM # + +FDA2 ; 062A 062E 0649 ; MA # ( ‎ﶢ‎ → ‎تخى‎ ) ARABIC LIGATURE TEH WITH KHAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # +FDA1 ; 062A 062E 0649 ; MA # ( ‎ﶡ‎ → ‎تخى‎ ) ARABIC LIGATURE TEH WITH KHAH WITH YEH FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # →‎تخي‎→ + +FC70 ; 062A 0631 ; MA # ( ‎ﱰ‎ → ‎تر‎ ) ARABIC LIGATURE TEH WITH REH FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER REH # + +FC71 ; 062A 0632 ; MA # ( ‎ﱱ‎ → ‎تز‎ ) ARABIC LIGATURE TEH WITH ZAIN FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER ZAIN # + +FCA4 ; 062A 0645 ; MA # ( ‎ﲤ‎ → ‎تم‎ ) ARABIC LIGATURE TEH WITH MEEM INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM # +FCE3 ; 062A 0645 ; MA # ( ‎ﳣ‎ → ‎تم‎ ) ARABIC LIGATURE TEH WITH MEEM MEDIAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM # +FC72 ; 062A 0645 ; MA # ( ‎ﱲ‎ → ‎تم‎ ) ARABIC LIGATURE TEH WITH MEEM FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM # +FC0E ; 062A 0645 ; MA # ( ‎ﰎ‎ → ‎تم‎ ) ARABIC 
LIGATURE TEH WITH MEEM ISOLATED FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM # + +FD55 ; 062A 0645 062C ; MA # ( ‎ﵕ‎ → ‎تمج‎ ) ARABIC LIGATURE TEH WITH MEEM WITH JEEM INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM, ARABIC LETTER JEEM # + +FD56 ; 062A 0645 062D ; MA # ( ‎ﵖ‎ → ‎تمح‎ ) ARABIC LIGATURE TEH WITH MEEM WITH HAH INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM, ARABIC LETTER HAH # + +FD57 ; 062A 0645 062E ; MA # ( ‎ﵗ‎ → ‎تمخ‎ ) ARABIC LIGATURE TEH WITH MEEM WITH KHAH INITIAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM, ARABIC LETTER KHAH # + +FDA4 ; 062A 0645 0649 ; MA # ( ‎ﶤ‎ → ‎تمى‎ ) ARABIC LIGATURE TEH WITH MEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # +FDA3 ; 062A 0645 0649 ; MA # ( ‎ﶣ‎ → ‎تمى‎ ) ARABIC LIGATURE TEH WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎تمي‎→ + +FC73 ; 062A 0646 ; MA # ( ‎ﱳ‎ → ‎تن‎ ) ARABIC LIGATURE TEH WITH NOON FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER NOON # + +FC74 ; 062A 0649 ; MA # ( ‎ﱴ‎ → ‎تى‎ ) ARABIC LIGATURE TEH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER ALEF MAKSURA # +FC0F ; 062A 0649 ; MA # ( ‎ﰏ‎ → ‎تى‎ ) ARABIC LIGATURE TEH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER TEH, ARABIC LETTER ALEF MAKSURA # +FC75 ; 062A 0649 ; MA # ( ‎ﱵ‎ → ‎تى‎ ) ARABIC LIGATURE TEH WITH YEH FINAL FORM → ARABIC LETTER TEH, ARABIC LETTER ALEF MAKSURA # →‎تي‎→ +FC10 ; 062A 0649 ; MA # ( ‎ﰐ‎ → ‎تى‎ ) ARABIC LIGATURE TEH WITH YEH ISOLATED FORM → ARABIC LETTER TEH, ARABIC LETTER ALEF MAKSURA # →‎تي‎→ + +FB64 ; 067F ; MA # ( ‎ﭤ‎ → ‎ٿ‎ ) ARABIC LETTER TEHEH INITIAL FORM → ARABIC LETTER TEHEH # +FB65 ; 067F ; MA # ( ‎ﭥ‎ → ‎ٿ‎ ) ARABIC LETTER TEHEH MEDIAL FORM → ARABIC LETTER TEHEH # +FB63 ; 067F ; MA # ( ‎ﭣ‎ → ‎ٿ‎ ) ARABIC LETTER TEHEH FINAL FORM → ARABIC LETTER TEHEH # +FB62 ; 067F ; MA # ( ‎ﭢ‎ → ‎ٿ‎ ) ARABIC LETTER TEHEH ISOLATED FORM → ARABIC LETTER TEHEH # + +1EE02 ; 062C ; MA # ( ‎𞸂‎ → 
‎ج‎ ) ARABIC MATHEMATICAL JEEM → ARABIC LETTER JEEM # +1EE22 ; 062C ; MA # ( ‎𞸢‎ → ‎ج‎ ) ARABIC MATHEMATICAL INITIAL JEEM → ARABIC LETTER JEEM # +1EE42 ; 062C ; MA # ( ‎𞹂‎ → ‎ج‎ ) ARABIC MATHEMATICAL TAILED JEEM → ARABIC LETTER JEEM # +1EE62 ; 062C ; MA # ( ‎𞹢‎ → ‎ج‎ ) ARABIC MATHEMATICAL STRETCHED JEEM → ARABIC LETTER JEEM # +1EE82 ; 062C ; MA # ( ‎𞺂‎ → ‎ج‎ ) ARABIC MATHEMATICAL LOOPED JEEM → ARABIC LETTER JEEM # +1EEA2 ; 062C ; MA # ( ‎𞺢‎ → ‎ج‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK JEEM → ARABIC LETTER JEEM # +FE9F ; 062C ; MA # ( ‎ﺟ‎ → ‎ج‎ ) ARABIC LETTER JEEM INITIAL FORM → ARABIC LETTER JEEM # +FEA0 ; 062C ; MA # ( ‎ﺠ‎ → ‎ج‎ ) ARABIC LETTER JEEM MEDIAL FORM → ARABIC LETTER JEEM # +FE9E ; 062C ; MA # ( ‎ﺞ‎ → ‎ج‎ ) ARABIC LETTER JEEM FINAL FORM → ARABIC LETTER JEEM # +FE9D ; 062C ; MA # ( ‎ﺝ‎ → ‎ج‎ ) ARABIC LETTER JEEM ISOLATED FORM → ARABIC LETTER JEEM # + +FCA7 ; 062C 062D ; MA # ( ‎ﲧ‎ → ‎جح‎ ) ARABIC LIGATURE JEEM WITH HAH INITIAL FORM → ARABIC LETTER JEEM, ARABIC LETTER HAH # +FC15 ; 062C 062D ; MA # ( ‎ﰕ‎ → ‎جح‎ ) ARABIC LIGATURE JEEM WITH HAH ISOLATED FORM → ARABIC LETTER JEEM, ARABIC LETTER HAH # + +FDA6 ; 062C 062D 0649 ; MA # ( ‎ﶦ‎ → ‎جحى‎ ) ARABIC LIGATURE JEEM WITH HAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER JEEM, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # +FDBE ; 062C 062D 0649 ; MA # ( ‎ﶾ‎ → ‎جحى‎ ) ARABIC LIGATURE JEEM WITH HAH WITH YEH FINAL FORM → ARABIC LETTER JEEM, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎جحي‎→ + +FDFB ; 062C 0644 0020 062C 0644 006C 0644 006F ; MA #* ( ‎ﷻ‎ → ‎جل جلlلo‎ ) ARABIC LIGATURE JALLAJALALOUHOU → ARABIC LETTER JEEM, ARABIC LETTER LAM, SPACE, ARABIC LETTER JEEM, ARABIC LETTER LAM, LATIN SMALL LETTER L, ARABIC LETTER LAM, LATIN SMALL LETTER O # →‎جل جلاله‎→ + +FCA8 ; 062C 0645 ; MA # ( ‎ﲨ‎ → ‎جم‎ ) ARABIC LIGATURE JEEM WITH MEEM INITIAL FORM → ARABIC LETTER JEEM, ARABIC LETTER MEEM # +FC16 ; 062C 0645 ; MA # ( ‎ﰖ‎ → ‎جم‎ ) ARABIC LIGATURE JEEM WITH MEEM ISOLATED FORM → ARABIC LETTER JEEM, ARABIC LETTER 
MEEM # + +FD59 ; 062C 0645 062D ; MA # ( ‎ﵙ‎ → ‎جمح‎ ) ARABIC LIGATURE JEEM WITH MEEM WITH HAH INITIAL FORM → ARABIC LETTER JEEM, ARABIC LETTER MEEM, ARABIC LETTER HAH # +FD58 ; 062C 0645 062D ; MA # ( ‎ﵘ‎ → ‎جمح‎ ) ARABIC LIGATURE JEEM WITH MEEM WITH HAH FINAL FORM → ARABIC LETTER JEEM, ARABIC LETTER MEEM, ARABIC LETTER HAH # + +FDA7 ; 062C 0645 0649 ; MA # ( ‎ﶧ‎ → ‎جمى‎ ) ARABIC LIGATURE JEEM WITH MEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER JEEM, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # +FDA5 ; 062C 0645 0649 ; MA # ( ‎ﶥ‎ → ‎جمى‎ ) ARABIC LIGATURE JEEM WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER JEEM, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎جمي‎→ + +FD1D ; 062C 0649 ; MA # ( ‎ﴝ‎ → ‎جى‎ ) ARABIC LIGATURE JEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # +FD01 ; 062C 0649 ; MA # ( ‎ﴁ‎ → ‎جى‎ ) ARABIC LIGATURE JEEM WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # +FD1E ; 062C 0649 ; MA # ( ‎ﴞ‎ → ‎جى‎ ) ARABIC LIGATURE JEEM WITH YEH FINAL FORM → ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎جي‎→ +FD02 ; 062C 0649 ; MA # ( ‎ﴂ‎ → ‎جى‎ ) ARABIC LIGATURE JEEM WITH YEH ISOLATED FORM → ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎جي‎→ + +FB78 ; 0683 ; MA # ( ‎ﭸ‎ → ‎ڃ‎ ) ARABIC LETTER NYEH INITIAL FORM → ARABIC LETTER NYEH # +FB79 ; 0683 ; MA # ( ‎ﭹ‎ → ‎ڃ‎ ) ARABIC LETTER NYEH MEDIAL FORM → ARABIC LETTER NYEH # +FB77 ; 0683 ; MA # ( ‎ﭷ‎ → ‎ڃ‎ ) ARABIC LETTER NYEH FINAL FORM → ARABIC LETTER NYEH # +FB76 ; 0683 ; MA # ( ‎ﭶ‎ → ‎ڃ‎ ) ARABIC LETTER NYEH ISOLATED FORM → ARABIC LETTER NYEH # + +FB74 ; 0684 ; MA # ( ‎ﭴ‎ → ‎ڄ‎ ) ARABIC LETTER DYEH INITIAL FORM → ARABIC LETTER DYEH # +FB75 ; 0684 ; MA # ( ‎ﭵ‎ → ‎ڄ‎ ) ARABIC LETTER DYEH MEDIAL FORM → ARABIC LETTER DYEH # +FB73 ; 0684 ; MA # ( ‎ﭳ‎ → ‎ڄ‎ ) ARABIC LETTER DYEH FINAL FORM → ARABIC LETTER DYEH # +FB72 ; 0684 ; MA # ( ‎ﭲ‎ → ‎ڄ‎ ) ARABIC LETTER DYEH ISOLATED FORM → ARABIC LETTER DYEH # + +FB7C ; 0686 ; MA # ( ‎ﭼ‎ → ‎چ‎ ) 
ARABIC LETTER TCHEH INITIAL FORM → ARABIC LETTER TCHEH # +FB7D ; 0686 ; MA # ( ‎ﭽ‎ → ‎چ‎ ) ARABIC LETTER TCHEH MEDIAL FORM → ARABIC LETTER TCHEH # +FB7B ; 0686 ; MA # ( ‎ﭻ‎ → ‎چ‎ ) ARABIC LETTER TCHEH FINAL FORM → ARABIC LETTER TCHEH # +FB7A ; 0686 ; MA # ( ‎ﭺ‎ → ‎چ‎ ) ARABIC LETTER TCHEH ISOLATED FORM → ARABIC LETTER TCHEH # + +08C1 ; 0686 0306 ; MA # ( ‎ࣁ‎ → ‎چ̆‎ ) ARABIC LETTER TCHEH WITH SMALL V → ARABIC LETTER TCHEH, COMBINING BREVE # →‎چٚ‎→ + +FB80 ; 0687 ; MA # ( ‎ﮀ‎ → ‎ڇ‎ ) ARABIC LETTER TCHEHEH INITIAL FORM → ARABIC LETTER TCHEHEH # +FB81 ; 0687 ; MA # ( ‎ﮁ‎ → ‎ڇ‎ ) ARABIC LETTER TCHEHEH MEDIAL FORM → ARABIC LETTER TCHEHEH # +FB7F ; 0687 ; MA # ( ‎ﭿ‎ → ‎ڇ‎ ) ARABIC LETTER TCHEHEH FINAL FORM → ARABIC LETTER TCHEHEH # +FB7E ; 0687 ; MA # ( ‎ﭾ‎ → ‎ڇ‎ ) ARABIC LETTER TCHEHEH ISOLATED FORM → ARABIC LETTER TCHEHEH # + +1EE07 ; 062D ; MA # ( ‎𞸇‎ → ‎ح‎ ) ARABIC MATHEMATICAL HAH → ARABIC LETTER HAH # +1EE27 ; 062D ; MA # ( ‎𞸧‎ → ‎ح‎ ) ARABIC MATHEMATICAL INITIAL HAH → ARABIC LETTER HAH # +1EE47 ; 062D ; MA # ( ‎𞹇‎ → ‎ح‎ ) ARABIC MATHEMATICAL TAILED HAH → ARABIC LETTER HAH # +1EE67 ; 062D ; MA # ( ‎𞹧‎ → ‎ح‎ ) ARABIC MATHEMATICAL STRETCHED HAH → ARABIC LETTER HAH # +1EE87 ; 062D ; MA # ( ‎𞺇‎ → ‎ح‎ ) ARABIC MATHEMATICAL LOOPED HAH → ARABIC LETTER HAH # +1EEA7 ; 062D ; MA # ( ‎𞺧‎ → ‎ح‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK HAH → ARABIC LETTER HAH # +FEA3 ; 062D ; MA # ( ‎ﺣ‎ → ‎ح‎ ) ARABIC LETTER HAH INITIAL FORM → ARABIC LETTER HAH # +FEA4 ; 062D ; MA # ( ‎ﺤ‎ → ‎ح‎ ) ARABIC LETTER HAH MEDIAL FORM → ARABIC LETTER HAH # +FEA2 ; 062D ; MA # ( ‎ﺢ‎ → ‎ح‎ ) ARABIC LETTER HAH FINAL FORM → ARABIC LETTER HAH # +FEA1 ; 062D ; MA # ( ‎ﺡ‎ → ‎ح‎ ) ARABIC LETTER HAH ISOLATED FORM → ARABIC LETTER HAH # + +0685 ; 062D 06DB ; MA # ( ‎څ‎ → ‎حۛ‎ ) ARABIC LETTER HAH WITH THREE DOTS ABOVE → ARABIC LETTER HAH, ARABIC SMALL HIGH THREE DOTS # + +0681 ; 062D 0654 ; MA # ( ‎ځ‎ → ‎حٔ‎ ) ARABIC LETTER HAH WITH HAMZA ABOVE → ARABIC LETTER HAH, ARABIC HAMZA ABOVE # +0772 ; 062D 0654 ; MA # ( ‎ݲ‎ → 
‎حٔ‎ ) ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH ABOVE → ARABIC LETTER HAH, ARABIC HAMZA ABOVE # + +FCA9 ; 062D 062C ; MA # ( ‎ﲩ‎ → ‎حج‎ ) ARABIC LIGATURE HAH WITH JEEM INITIAL FORM → ARABIC LETTER HAH, ARABIC LETTER JEEM # +FC17 ; 062D 062C ; MA # ( ‎ﰗ‎ → ‎حج‎ ) ARABIC LIGATURE HAH WITH JEEM ISOLATED FORM → ARABIC LETTER HAH, ARABIC LETTER JEEM # + +FDBF ; 062D 062C 0649 ; MA # ( ‎ﶿ‎ → ‎حجى‎ ) ARABIC LIGATURE HAH WITH JEEM WITH YEH FINAL FORM → ARABIC LETTER HAH, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎حجي‎→ + +FCAA ; 062D 0645 ; MA # ( ‎ﲪ‎ → ‎حم‎ ) ARABIC LIGATURE HAH WITH MEEM INITIAL FORM → ARABIC LETTER HAH, ARABIC LETTER MEEM # +FC18 ; 062D 0645 ; MA # ( ‎ﰘ‎ → ‎حم‎ ) ARABIC LIGATURE HAH WITH MEEM ISOLATED FORM → ARABIC LETTER HAH, ARABIC LETTER MEEM # + +FD5B ; 062D 0645 0649 ; MA # ( ‎ﵛ‎ → ‎حمى‎ ) ARABIC LIGATURE HAH WITH MEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER HAH, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # +FD5A ; 062D 0645 0649 ; MA # ( ‎ﵚ‎ → ‎حمى‎ ) ARABIC LIGATURE HAH WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER HAH, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎حمي‎→ + +FD1B ; 062D 0649 ; MA # ( ‎ﴛ‎ → ‎حى‎ ) ARABIC LIGATURE HAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # +FCFF ; 062D 0649 ; MA # ( ‎ﳿ‎ → ‎حى‎ ) ARABIC LIGATURE HAH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # +FD1C ; 062D 0649 ; MA # ( ‎ﴜ‎ → ‎حى‎ ) ARABIC LIGATURE HAH WITH YEH FINAL FORM → ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎حي‎→ +FD00 ; 062D 0649 ; MA # ( ‎ﴀ‎ → ‎حى‎ ) ARABIC LIGATURE HAH WITH YEH ISOLATED FORM → ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎حي‎→ + +1EE17 ; 062E ; MA # ( ‎𞸗‎ → ‎خ‎ ) ARABIC MATHEMATICAL KHAH → ARABIC LETTER KHAH # +1EE37 ; 062E ; MA # ( ‎𞸷‎ → ‎خ‎ ) ARABIC MATHEMATICAL INITIAL KHAH → ARABIC LETTER KHAH # +1EE57 ; 062E ; MA # ( ‎𞹗‎ → ‎خ‎ ) ARABIC MATHEMATICAL TAILED KHAH → ARABIC LETTER KHAH # +1EE77 ; 062E ; MA # ( ‎𞹷‎ → ‎خ‎ ) 
ARABIC MATHEMATICAL STRETCHED KHAH → ARABIC LETTER KHAH # +1EE97 ; 062E ; MA # ( ‎𞺗‎ → ‎خ‎ ) ARABIC MATHEMATICAL LOOPED KHAH → ARABIC LETTER KHAH # +1EEB7 ; 062E ; MA # ( ‎𞺷‎ → ‎خ‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK KHAH → ARABIC LETTER KHAH # +FEA7 ; 062E ; MA # ( ‎ﺧ‎ → ‎خ‎ ) ARABIC LETTER KHAH INITIAL FORM → ARABIC LETTER KHAH # +FEA8 ; 062E ; MA # ( ‎ﺨ‎ → ‎خ‎ ) ARABIC LETTER KHAH MEDIAL FORM → ARABIC LETTER KHAH # +FEA6 ; 062E ; MA # ( ‎ﺦ‎ → ‎خ‎ ) ARABIC LETTER KHAH FINAL FORM → ARABIC LETTER KHAH # +FEA5 ; 062E ; MA # ( ‎ﺥ‎ → ‎خ‎ ) ARABIC LETTER KHAH ISOLATED FORM → ARABIC LETTER KHAH # + +FCAB ; 062E 062C ; MA # ( ‎ﲫ‎ → ‎خج‎ ) ARABIC LIGATURE KHAH WITH JEEM INITIAL FORM → ARABIC LETTER KHAH, ARABIC LETTER JEEM # +FC19 ; 062E 062C ; MA # ( ‎ﰙ‎ → ‎خج‎ ) ARABIC LIGATURE KHAH WITH JEEM ISOLATED FORM → ARABIC LETTER KHAH, ARABIC LETTER JEEM # + +FC1A ; 062E 062D ; MA # ( ‎ﰚ‎ → ‎خح‎ ) ARABIC LIGATURE KHAH WITH HAH ISOLATED FORM → ARABIC LETTER KHAH, ARABIC LETTER HAH # + +FCAC ; 062E 0645 ; MA # ( ‎ﲬ‎ → ‎خم‎ ) ARABIC LIGATURE KHAH WITH MEEM INITIAL FORM → ARABIC LETTER KHAH, ARABIC LETTER MEEM # +FC1B ; 062E 0645 ; MA # ( ‎ﰛ‎ → ‎خم‎ ) ARABIC LIGATURE KHAH WITH MEEM ISOLATED FORM → ARABIC LETTER KHAH, ARABIC LETTER MEEM # + +FD1F ; 062E 0649 ; MA # ( ‎ﴟ‎ → ‎خى‎ ) ARABIC LIGATURE KHAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # +FD03 ; 062E 0649 ; MA # ( ‎ﴃ‎ → ‎خى‎ ) ARABIC LIGATURE KHAH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # +FD20 ; 062E 0649 ; MA # ( ‎ﴠ‎ → ‎خى‎ ) ARABIC LIGATURE KHAH WITH YEH FINAL FORM → ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # →‎خي‎→ +FD04 ; 062E 0649 ; MA # ( ‎ﴄ‎ → ‎خى‎ ) ARABIC LIGATURE KHAH WITH YEH ISOLATED FORM → ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # →‎خي‎→ + +102E1 ; 062F ; MA #* ( 𐋡 → ‎د‎ ) COPTIC EPACT DIGIT ONE → ARABIC LETTER DAL # +1EE03 ; 062F ; MA # ( ‎𞸃‎ → ‎د‎ ) ARABIC MATHEMATICAL DAL → ARABIC LETTER DAL # +1EE83 ; 062F ; MA # ( ‎𞺃‎ → 
‎د‎ ) ARABIC MATHEMATICAL LOOPED DAL → ARABIC LETTER DAL # +1EEA3 ; 062F ; MA # ( ‎𞺣‎ → ‎د‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK DAL → ARABIC LETTER DAL # +FEAA ; 062F ; MA # ( ‎ﺪ‎ → ‎د‎ ) ARABIC LETTER DAL FINAL FORM → ARABIC LETTER DAL # +FEA9 ; 062F ; MA # ( ‎ﺩ‎ → ‎د‎ ) ARABIC LETTER DAL ISOLATED FORM → ARABIC LETTER DAL # + +0688 ; 062F 0615 ; MA # ( ‎ڈ‎ → ‎دؕ‎ ) ARABIC LETTER DDAL → ARABIC LETTER DAL, ARABIC SMALL HIGH TAH # +FB89 ; 062F 0615 ; MA # ( ‎ﮉ‎ → ‎دؕ‎ ) ARABIC LETTER DDAL FINAL FORM → ARABIC LETTER DAL, ARABIC SMALL HIGH TAH # →‎ڈ‎→ +FB88 ; 062F 0615 ; MA # ( ‎ﮈ‎ → ‎دؕ‎ ) ARABIC LETTER DDAL ISOLATED FORM → ARABIC LETTER DAL, ARABIC SMALL HIGH TAH # →‎ڈ‎→ + +068E ; 062F 06DB ; MA # ( ‎ڎ‎ → ‎دۛ‎ ) ARABIC LETTER DUL → ARABIC LETTER DAL, ARABIC SMALL HIGH THREE DOTS # +FB87 ; 062F 06DB ; MA # ( ‎ﮇ‎ → ‎دۛ‎ ) ARABIC LETTER DUL FINAL FORM → ARABIC LETTER DAL, ARABIC SMALL HIGH THREE DOTS # →‎ڎ‎→ +FB86 ; 062F 06DB ; MA # ( ‎ﮆ‎ → ‎دۛ‎ ) ARABIC LETTER DUL ISOLATED FORM → ARABIC LETTER DAL, ARABIC SMALL HIGH THREE DOTS # →‎ڎ‎→ +068F ; 062F 06DB ; MA # ( ‎ڏ‎ → ‎دۛ‎ ) ARABIC LETTER DAL WITH THREE DOTS ABOVE DOWNWARDS → ARABIC LETTER DAL, ARABIC SMALL HIGH THREE DOTS # →‎ڎ‎→ + +06EE ; 062F 0302 ; MA # ( ‎ۮ‎ → ‎د̂‎ ) ARABIC LETTER DAL WITH INVERTED V → ARABIC LETTER DAL, COMBINING CIRCUMFLEX ACCENT # →‎دٛ‎→ + +08AE ; 062F 0324 0323 ; MA # ( ‎ࢮ‎ → ‎د̤̣‎ ) ARABIC LETTER DAL WITH THREE DOTS BELOW → ARABIC LETTER DAL, COMBINING DIAERESIS BELOW, COMBINING DOT BELOW # →‎د࣮࣭‎→ + +1EE18 ; 0630 ; MA # ( ‎𞸘‎ → ‎ذ‎ ) ARABIC MATHEMATICAL THAL → ARABIC LETTER THAL # +1EE98 ; 0630 ; MA # ( ‎𞺘‎ → ‎ذ‎ ) ARABIC MATHEMATICAL LOOPED THAL → ARABIC LETTER THAL # +1EEB8 ; 0630 ; MA # ( ‎𞺸‎ → ‎ذ‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK THAL → ARABIC LETTER THAL # +FEAC ; 0630 ; MA # ( ‎ﺬ‎ → ‎ذ‎ ) ARABIC LETTER THAL FINAL FORM → ARABIC LETTER THAL # +FEAB ; 0630 ; MA # ( ‎ﺫ‎ → ‎ذ‎ ) ARABIC LETTER THAL ISOLATED FORM → ARABIC LETTER THAL # + +FC5B ; 0630 0670 ; MA # ( ‎ﱛ‎ → ‎ذٰ‎ ) ARABIC 
LIGATURE THAL WITH SUPERSCRIPT ALEF ISOLATED FORM → ARABIC LETTER THAL, ARABIC LETTER SUPERSCRIPT ALEF # + +068B ; 068A 0615 ; MA # ( ‎ڋ‎ → ‎ڊؕ‎ ) ARABIC LETTER DAL WITH DOT BELOW AND SMALL TAH → ARABIC LETTER DAL WITH DOT BELOW, ARABIC SMALL HIGH TAH # + +FB85 ; 068C ; MA # ( ‎ﮅ‎ → ‎ڌ‎ ) ARABIC LETTER DAHAL FINAL FORM → ARABIC LETTER DAHAL # +FB84 ; 068C ; MA # ( ‎ﮄ‎ → ‎ڌ‎ ) ARABIC LETTER DAHAL ISOLATED FORM → ARABIC LETTER DAHAL # + +FB83 ; 068D ; MA # ( ‎ﮃ‎ → ‎ڍ‎ ) ARABIC LETTER DDAHAL FINAL FORM → ARABIC LETTER DDAHAL # +FB82 ; 068D ; MA # ( ‎ﮂ‎ → ‎ڍ‎ ) ARABIC LETTER DDAHAL ISOLATED FORM → ARABIC LETTER DDAHAL # + +1EE13 ; 0631 ; MA # ( ‎𞸓‎ → ‎ر‎ ) ARABIC MATHEMATICAL REH → ARABIC LETTER REH # +1EE93 ; 0631 ; MA # ( ‎𞺓‎ → ‎ر‎ ) ARABIC MATHEMATICAL LOOPED REH → ARABIC LETTER REH # +1EEB3 ; 0631 ; MA # ( ‎𞺳‎ → ‎ر‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK REH → ARABIC LETTER REH # +FEAE ; 0631 ; MA # ( ‎ﺮ‎ → ‎ر‎ ) ARABIC LETTER REH FINAL FORM → ARABIC LETTER REH # +FEAD ; 0631 ; MA # ( ‎ﺭ‎ → ‎ر‎ ) ARABIC LETTER REH ISOLATED FORM → ARABIC LETTER REH # + +0691 ; 0631 0615 ; MA # ( ‎ڑ‎ → ‎رؕ‎ ) ARABIC LETTER RREH → ARABIC LETTER REH, ARABIC SMALL HIGH TAH # +FB8D ; 0631 0615 ; MA # ( ‎ﮍ‎ → ‎رؕ‎ ) ARABIC LETTER RREH FINAL FORM → ARABIC LETTER REH, ARABIC SMALL HIGH TAH # →‎ڑ‎→ +FB8C ; 0631 0615 ; MA # ( ‎ﮌ‎ → ‎رؕ‎ ) ARABIC LETTER RREH ISOLATED FORM → ARABIC LETTER REH, ARABIC SMALL HIGH TAH # →‎ڑ‎→ + +0698 ; 0631 06DB ; MA # ( ‎ژ‎ → ‎رۛ‎ ) ARABIC LETTER JEH → ARABIC LETTER REH, ARABIC SMALL HIGH THREE DOTS # +FB8B ; 0631 06DB ; MA # ( ‎ﮋ‎ → ‎رۛ‎ ) ARABIC LETTER JEH FINAL FORM → ARABIC LETTER REH, ARABIC SMALL HIGH THREE DOTS # →‎ژ‎→ +FB8A ; 0631 06DB ; MA # ( ‎ﮊ‎ → ‎رۛ‎ ) ARABIC LETTER JEH ISOLATED FORM → ARABIC LETTER REH, ARABIC SMALL HIGH THREE DOTS # →‎ژ‎→ + +0692 ; 0631 0306 ; MA # ( ‎ڒ‎ → ‎ر̆‎ ) ARABIC LETTER REH WITH SMALL V → ARABIC LETTER REH, COMBINING BREVE # →‎رٚ‎→ + +08B9 ; 0631 0306 0307 ; MA # ( ‎ࢹ‎ → ‎ر̆̇‎ ) ARABIC LETTER REH WITH SMALL NOON ABOVE → ARABIC 
LETTER REH, COMBINING BREVE, COMBINING DOT ABOVE # →‎رۨ‎→ + +06EF ; 0631 0302 ; MA # ( ‎ۯ‎ → ‎ر̂‎ ) ARABIC LETTER REH WITH INVERTED V → ARABIC LETTER REH, COMBINING CIRCUMFLEX ACCENT # →‎رٛ‎→ + +076C ; 0631 0654 ; MA # ( ‎ݬ‎ → ‎رٔ‎ ) ARABIC LETTER REH WITH HAMZA ABOVE → ARABIC LETTER REH, ARABIC HAMZA ABOVE # + +FC5C ; 0631 0670 ; MA # ( ‎ﱜ‎ → ‎رٰ‎ ) ARABIC LIGATURE REH WITH SUPERSCRIPT ALEF ISOLATED FORM → ARABIC LETTER REH, ARABIC LETTER SUPERSCRIPT ALEF # + +FDF6 ; 0631 0633 0648 0644 ; MA # ( ‎ﷶ‎ → ‎رسول‎ ) ARABIC LIGATURE RASOUL ISOLATED FORM → ARABIC LETTER REH, ARABIC LETTER SEEN, ARABIC LETTER WAW, ARABIC LETTER LAM # + +FDFC ; 0631 0649 006C 0644 ; MA #* ( ‎﷼‎ → ‎رىlل‎ ) RIAL SIGN → ARABIC LETTER REH, ARABIC LETTER ALEF MAKSURA, LATIN SMALL LETTER L, ARABIC LETTER LAM # →‎ریال‎→ +20C1 ; 0631 0649 006C 0644 ; MA #* ( ⃁ → ‎رىlل‎ ) SAUDI RIYAL SIGN → ARABIC LETTER REH, ARABIC LETTER ALEF MAKSURA, LATIN SMALL LETTER L, ARABIC LETTER LAM # →‎﷼‎→→‎ریال‎→ + +1EE06 ; 0632 ; MA # ( ‎𞸆‎ → ‎ز‎ ) ARABIC MATHEMATICAL ZAIN → ARABIC LETTER ZAIN # +1EE86 ; 0632 ; MA # ( ‎𞺆‎ → ‎ز‎ ) ARABIC MATHEMATICAL LOOPED ZAIN → ARABIC LETTER ZAIN # +1EEA6 ; 0632 ; MA # ( ‎𞺦‎ → ‎ز‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK ZAIN → ARABIC LETTER ZAIN # +FEB0 ; 0632 ; MA # ( ‎ﺰ‎ → ‎ز‎ ) ARABIC LETTER ZAIN FINAL FORM → ARABIC LETTER ZAIN # +FEAF ; 0632 ; MA # ( ‎ﺯ‎ → ‎ز‎ ) ARABIC LETTER ZAIN ISOLATED FORM → ARABIC LETTER ZAIN # + +08B2 ; 0632 0302 ; MA # ( ‎ࢲ‎ → ‎ز̂‎ ) ARABIC LETTER ZAIN WITH INVERTED V ABOVE → ARABIC LETTER ZAIN, COMBINING CIRCUMFLEX ACCENT # →‎زٛ‎→ + +0771 ; 0697 0615 ; MA # ( ‎ݱ‎ → ‎ڗؕ‎ ) ARABIC LETTER REH WITH SMALL ARABIC LETTER TAH AND TWO DOTS → ARABIC LETTER REH WITH TWO DOTS ABOVE, ARABIC SMALL HIGH TAH # + +1EE0E ; 0633 ; MA # ( ‎𞸎‎ → ‎س‎ ) ARABIC MATHEMATICAL SEEN → ARABIC LETTER SEEN # +1EE2E ; 0633 ; MA # ( ‎𞸮‎ → ‎س‎ ) ARABIC MATHEMATICAL INITIAL SEEN → ARABIC LETTER SEEN # +1EE4E ; 0633 ; MA # ( ‎𞹎‎ → ‎س‎ ) ARABIC MATHEMATICAL TAILED SEEN → ARABIC LETTER SEEN # 
+1EE6E ; 0633 ; MA # ( ‎𞹮‎ → ‎س‎ ) ARABIC MATHEMATICAL STRETCHED SEEN → ARABIC LETTER SEEN # +1EE8E ; 0633 ; MA # ( ‎𞺎‎ → ‎س‎ ) ARABIC MATHEMATICAL LOOPED SEEN → ARABIC LETTER SEEN # +1EEAE ; 0633 ; MA # ( ‎𞺮‎ → ‎س‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK SEEN → ARABIC LETTER SEEN # +FEB3 ; 0633 ; MA # ( ‎ﺳ‎ → ‎س‎ ) ARABIC LETTER SEEN INITIAL FORM → ARABIC LETTER SEEN # +FEB4 ; 0633 ; MA # ( ‎ﺴ‎ → ‎س‎ ) ARABIC LETTER SEEN MEDIAL FORM → ARABIC LETTER SEEN # +FEB2 ; 0633 ; MA # ( ‎ﺲ‎ → ‎س‎ ) ARABIC LETTER SEEN FINAL FORM → ARABIC LETTER SEEN # +FEB1 ; 0633 ; MA # ( ‎ﺱ‎ → ‎س‎ ) ARABIC LETTER SEEN ISOLATED FORM → ARABIC LETTER SEEN # + +0634 ; 0633 06DB ; MA # ( ‎ش‎ → ‎سۛ‎ ) ARABIC LETTER SHEEN → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # +1EE14 ; 0633 06DB ; MA # ( ‎𞸔‎ → ‎سۛ‎ ) ARABIC MATHEMATICAL SHEEN → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +1EE34 ; 0633 06DB ; MA # ( ‎𞸴‎ → ‎سۛ‎ ) ARABIC MATHEMATICAL INITIAL SHEEN → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +1EE54 ; 0633 06DB ; MA # ( ‎𞹔‎ → ‎سۛ‎ ) ARABIC MATHEMATICAL TAILED SHEEN → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +1EE74 ; 0633 06DB ; MA # ( ‎𞹴‎ → ‎سۛ‎ ) ARABIC MATHEMATICAL STRETCHED SHEEN → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +1EE94 ; 0633 06DB ; MA # ( ‎𞺔‎ → ‎سۛ‎ ) ARABIC MATHEMATICAL LOOPED SHEEN → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +1EEB4 ; 0633 06DB ; MA # ( ‎𞺴‎ → ‎سۛ‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK SHEEN → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +FEB7 ; 0633 06DB ; MA # ( ‎ﺷ‎ → ‎سۛ‎ ) ARABIC LETTER SHEEN INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +FEB8 ; 0633 06DB ; MA # ( ‎ﺸ‎ → ‎سۛ‎ ) ARABIC LETTER SHEEN MEDIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +FEB6 ; 0633 06DB ; MA # ( ‎ﺶ‎ → ‎سۛ‎ ) ARABIC LETTER SHEEN FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ +FEB5 ; 0633 06DB ; MA # ( ‎ﺵ‎ → ‎سۛ‎ ) ARABIC LETTER 
SHEEN ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS # →‎ش‎→ + +077E ; 0633 0302 ; MA # ( ‎ݾ‎ → ‎س̂‎ ) ARABIC LETTER SEEN WITH INVERTED V → ARABIC LETTER SEEN, COMBINING CIRCUMFLEX ACCENT # →‎سٛ‎→ + +FD31 ; 0633 006F ; MA # ( ‎ﴱ‎ → ‎سo‎ ) ARABIC LIGATURE SEEN WITH HEH INITIAL FORM → ARABIC LETTER SEEN, LATIN SMALL LETTER O # →‎سه‎→ +FCE8 ; 0633 006F ; MA # ( ‎ﳨ‎ → ‎سo‎ ) ARABIC LIGATURE SEEN WITH HEH MEDIAL FORM → ARABIC LETTER SEEN, LATIN SMALL LETTER O # →‎سه‎→ + +FD32 ; 0633 06DB 006F ; MA # ( ‎ﴲ‎ → ‎سۛo‎ ) ARABIC LIGATURE SHEEN WITH HEH INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, LATIN SMALL LETTER O # →‎شه‎→ +FCEA ; 0633 06DB 006F ; MA # ( ‎ﳪ‎ → ‎سۛo‎ ) ARABIC LIGATURE SHEEN WITH HEH MEDIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, LATIN SMALL LETTER O # →‎شه‎→ + +FCAD ; 0633 062C ; MA # ( ‎ﲭ‎ → ‎سج‎ ) ARABIC LIGATURE SEEN WITH JEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER JEEM # +FD34 ; 0633 062C ; MA # ( ‎ﴴ‎ → ‎سج‎ ) ARABIC LIGATURE SEEN WITH JEEM MEDIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER JEEM # +FC1C ; 0633 062C ; MA # ( ‎ﰜ‎ → ‎سج‎ ) ARABIC LIGATURE SEEN WITH JEEM ISOLATED FORM → ARABIC LETTER SEEN, ARABIC LETTER JEEM # + +FD2D ; 0633 06DB 062C ; MA # ( ‎ﴭ‎ → ‎سۛج‎ ) ARABIC LIGATURE SHEEN WITH JEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER JEEM # →‎شج‎→ +FD37 ; 0633 06DB 062C ; MA # ( ‎ﴷ‎ → ‎سۛج‎ ) ARABIC LIGATURE SHEEN WITH JEEM MEDIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER JEEM # →‎شج‎→ +FD25 ; 0633 06DB 062C ; MA # ( ‎ﴥ‎ → ‎سۛج‎ ) ARABIC LIGATURE SHEEN WITH JEEM FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER JEEM # →‎شج‎→ +FD09 ; 0633 06DB 062C ; MA # ( ‎ﴉ‎ → ‎سۛج‎ ) ARABIC LIGATURE SHEEN WITH JEEM ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER JEEM # →‎شج‎→ + +FD5D ; 0633 062C 062D ; MA # ( ‎ﵝ‎ → ‎سجح‎ ) ARABIC LIGATURE SEEN WITH JEEM WITH HAH INITIAL FORM 
→ ARABIC LETTER SEEN, ARABIC LETTER JEEM, ARABIC LETTER HAH # + +FD5E ; 0633 062C 0649 ; MA # ( ‎ﵞ‎ → ‎سجى‎ ) ARABIC LIGATURE SEEN WITH JEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # + +FD69 ; 0633 06DB 062C 0649 ; MA # ( ‎ﵩ‎ → ‎سۛجى‎ ) ARABIC LIGATURE SHEEN WITH JEEM WITH YEH FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎شجي‎→ + +FCAE ; 0633 062D ; MA # ( ‎ﲮ‎ → ‎سح‎ ) ARABIC LIGATURE SEEN WITH HAH INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER HAH # +FD35 ; 0633 062D ; MA # ( ‎ﴵ‎ → ‎سح‎ ) ARABIC LIGATURE SEEN WITH HAH MEDIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER HAH # +FC1D ; 0633 062D ; MA # ( ‎ﰝ‎ → ‎سح‎ ) ARABIC LIGATURE SEEN WITH HAH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC LETTER HAH # + +FD2E ; 0633 06DB 062D ; MA # ( ‎ﴮ‎ → ‎سۛح‎ ) ARABIC LIGATURE SHEEN WITH HAH INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER HAH # →‎شح‎→ +FD38 ; 0633 06DB 062D ; MA # ( ‎ﴸ‎ → ‎سۛح‎ ) ARABIC LIGATURE SHEEN WITH HAH MEDIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER HAH # →‎شح‎→ +FD26 ; 0633 06DB 062D ; MA # ( ‎ﴦ‎ → ‎سۛح‎ ) ARABIC LIGATURE SHEEN WITH HAH FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER HAH # →‎شح‎→ +FD0A ; 0633 06DB 062D ; MA # ( ‎ﴊ‎ → ‎سۛح‎ ) ARABIC LIGATURE SHEEN WITH HAH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER HAH # →‎شح‎→ + +FD5C ; 0633 062D 062C ; MA # ( ‎ﵜ‎ → ‎سحج‎ ) ARABIC LIGATURE SEEN WITH HAH WITH JEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER HAH, ARABIC LETTER JEEM # + +FD68 ; 0633 06DB 062D 0645 ; MA # ( ‎ﵨ‎ → ‎سۛحم‎ ) ARABIC LIGATURE SHEEN WITH HAH WITH MEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER HAH, ARABIC LETTER MEEM # →‎شحم‎→ +FD67 ; 0633 06DB 062D 0645 ; MA # ( ‎ﵧ‎ → ‎سۛحم‎ ) ARABIC LIGATURE SHEEN WITH HAH WITH MEEM FINAL FORM → ARABIC LETTER 
SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER HAH, ARABIC LETTER MEEM # →‎شحم‎→ + +FDAA ; 0633 06DB 062D 0649 ; MA # ( ‎ﶪ‎ → ‎سۛحى‎ ) ARABIC LIGATURE SHEEN WITH HAH WITH YEH FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎شحي‎→ + +FCAF ; 0633 062E ; MA # ( ‎ﲯ‎ → ‎سخ‎ ) ARABIC LIGATURE SEEN WITH KHAH INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER KHAH # +FD36 ; 0633 062E ; MA # ( ‎ﴶ‎ → ‎سخ‎ ) ARABIC LIGATURE SEEN WITH KHAH MEDIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER KHAH # +FC1E ; 0633 062E ; MA # ( ‎ﰞ‎ → ‎سخ‎ ) ARABIC LIGATURE SEEN WITH KHAH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC LETTER KHAH # + +FD2F ; 0633 06DB 062E ; MA # ( ‎ﴯ‎ → ‎سۛخ‎ ) ARABIC LIGATURE SHEEN WITH KHAH INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER KHAH # →‎شخ‎→ +FD39 ; 0633 06DB 062E ; MA # ( ‎ﴹ‎ → ‎سۛخ‎ ) ARABIC LIGATURE SHEEN WITH KHAH MEDIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER KHAH # →‎شخ‎→ +FD27 ; 0633 06DB 062E ; MA # ( ‎ﴧ‎ → ‎سۛخ‎ ) ARABIC LIGATURE SHEEN WITH KHAH FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER KHAH # →‎شخ‎→ +FD0B ; 0633 06DB 062E ; MA # ( ‎ﴋ‎ → ‎سۛخ‎ ) ARABIC LIGATURE SHEEN WITH KHAH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER KHAH # →‎شخ‎→ + +FDA8 ; 0633 062E 0649 ; MA # ( ‎ﶨ‎ → ‎سخى‎ ) ARABIC LIGATURE SEEN WITH KHAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # +FDC6 ; 0633 062E 0649 ; MA # ( ‎ﷆ‎ → ‎سخى‎ ) ARABIC LIGATURE SEEN WITH KHAH WITH YEH FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # →‎سخي‎→ + +FD2A ; 0633 0631 ; MA # ( ‎ﴪ‎ → ‎سر‎ ) ARABIC LIGATURE SEEN WITH REH FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER REH # +FD0E ; 0633 0631 ; MA # ( ‎ﴎ‎ → ‎سر‎ ) ARABIC LIGATURE SEEN WITH REH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC LETTER REH # + +FD29 ; 0633 06DB 0631 ; MA 
# ( ‎ﴩ‎ → ‎سۛر‎ ) ARABIC LIGATURE SHEEN WITH REH FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER REH # →‎شر‎→ +FD0D ; 0633 06DB 0631 ; MA # ( ‎ﴍ‎ → ‎سۛر‎ ) ARABIC LIGATURE SHEEN WITH REH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER REH # →‎شر‎→ + +FCB0 ; 0633 0645 ; MA # ( ‎ﲰ‎ → ‎سم‎ ) ARABIC LIGATURE SEEN WITH MEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM # +FCE7 ; 0633 0645 ; MA # ( ‎ﳧ‎ → ‎سم‎ ) ARABIC LIGATURE SEEN WITH MEEM MEDIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM # +FC1F ; 0633 0645 ; MA # ( ‎ﰟ‎ → ‎سم‎ ) ARABIC LIGATURE SEEN WITH MEEM ISOLATED FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM # + +FD30 ; 0633 06DB 0645 ; MA # ( ‎ﴰ‎ → ‎سۛم‎ ) ARABIC LIGATURE SHEEN WITH MEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎شم‎→ +FCE9 ; 0633 06DB 0645 ; MA # ( ‎ﳩ‎ → ‎سۛم‎ ) ARABIC LIGATURE SHEEN WITH MEEM MEDIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎شم‎→ +FD28 ; 0633 06DB 0645 ; MA # ( ‎ﴨ‎ → ‎سۛم‎ ) ARABIC LIGATURE SHEEN WITH MEEM FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎شم‎→ +FD0C ; 0633 06DB 0645 ; MA # ( ‎ﴌ‎ → ‎سۛم‎ ) ARABIC LIGATURE SHEEN WITH MEEM ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎شم‎→ + +FD61 ; 0633 0645 062C ; MA # ( ‎ﵡ‎ → ‎سمج‎ ) ARABIC LIGATURE SEEN WITH MEEM WITH JEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM, ARABIC LETTER JEEM # + +FD60 ; 0633 0645 062D ; MA # ( ‎ﵠ‎ → ‎سمح‎ ) ARABIC LIGATURE SEEN WITH MEEM WITH HAH INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM, ARABIC LETTER HAH # +FD5F ; 0633 0645 062D ; MA # ( ‎ﵟ‎ → ‎سمح‎ ) ARABIC LIGATURE SEEN WITH MEEM WITH HAH FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM, ARABIC LETTER HAH # + +FD6B ; 0633 06DB 0645 062E ; MA # ( ‎ﵫ‎ → ‎سۛمخ‎ ) ARABIC LIGATURE SHEEN WITH MEEM WITH KHAH INITIAL FORM → ARABIC LETTER SEEN, 
ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM, ARABIC LETTER KHAH # →‎شمخ‎→ +FD6A ; 0633 06DB 0645 062E ; MA # ( ‎ﵪ‎ → ‎سۛمخ‎ ) ARABIC LIGATURE SHEEN WITH MEEM WITH KHAH FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM, ARABIC LETTER KHAH # →‎شمخ‎→ + +FD63 ; 0633 0645 0645 ; MA # ( ‎ﵣ‎ → ‎سمم‎ ) ARABIC LIGATURE SEEN WITH MEEM WITH MEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM, ARABIC LETTER MEEM # +FD62 ; 0633 0645 0645 ; MA # ( ‎ﵢ‎ → ‎سمم‎ ) ARABIC LIGATURE SEEN WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER MEEM, ARABIC LETTER MEEM # + +FD6D ; 0633 06DB 0645 0645 ; MA # ( ‎ﵭ‎ → ‎سۛمم‎ ) ARABIC LIGATURE SHEEN WITH MEEM WITH MEEM INITIAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM, ARABIC LETTER MEEM # →‎شمم‎→ +FD6C ; 0633 06DB 0645 0645 ; MA # ( ‎ﵬ‎ → ‎سۛمم‎ ) ARABIC LIGATURE SHEEN WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM, ARABIC LETTER MEEM # →‎شمم‎→ + +FD17 ; 0633 0649 ; MA # ( ‎ﴗ‎ → ‎سى‎ ) ARABIC LIGATURE SEEN WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER ALEF MAKSURA # +FCFB ; 0633 0649 ; MA # ( ‎ﳻ‎ → ‎سى‎ ) ARABIC LIGATURE SEEN WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER SEEN, ARABIC LETTER ALEF MAKSURA # +FD18 ; 0633 0649 ; MA # ( ‎ﴘ‎ → ‎سى‎ ) ARABIC LIGATURE SEEN WITH YEH FINAL FORM → ARABIC LETTER SEEN, ARABIC LETTER ALEF MAKSURA # →‎سي‎→ +FCFC ; 0633 0649 ; MA # ( ‎ﳼ‎ → ‎سى‎ ) ARABIC LIGATURE SEEN WITH YEH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC LETTER ALEF MAKSURA # →‎سي‎→ + +FD19 ; 0633 06DB 0649 ; MA # ( ‎ﴙ‎ → ‎سۛى‎ ) ARABIC LIGATURE SHEEN WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # →‎شى‎→ +FCFD ; 0633 06DB 0649 ; MA # ( ‎ﳽ‎ → ‎سۛى‎ ) ARABIC LIGATURE SHEEN WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # →‎شى‎→ +FD1A ; 0633 06DB 0649 ; MA # 
( ‎ﴚ‎ → ‎سۛى‎ ) ARABIC LIGATURE SHEEN WITH YEH FINAL FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # →‎شي‎→ +FCFE ; 0633 06DB 0649 ; MA # ( ‎ﳾ‎ → ‎سۛى‎ ) ARABIC LIGATURE SHEEN WITH YEH ISOLATED FORM → ARABIC LETTER SEEN, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # →‎شي‎→ + +102F2 ; 0635 ; MA #* ( 𐋲 → ‎ص‎ ) COPTIC EPACT NUMBER NINETY → ARABIC LETTER SAD # +1EE11 ; 0635 ; MA # ( ‎𞸑‎ → ‎ص‎ ) ARABIC MATHEMATICAL SAD → ARABIC LETTER SAD # +1EE31 ; 0635 ; MA # ( ‎𞸱‎ → ‎ص‎ ) ARABIC MATHEMATICAL INITIAL SAD → ARABIC LETTER SAD # +1EE51 ; 0635 ; MA # ( ‎𞹑‎ → ‎ص‎ ) ARABIC MATHEMATICAL TAILED SAD → ARABIC LETTER SAD # +1EE71 ; 0635 ; MA # ( ‎𞹱‎ → ‎ص‎ ) ARABIC MATHEMATICAL STRETCHED SAD → ARABIC LETTER SAD # +1EE91 ; 0635 ; MA # ( ‎𞺑‎ → ‎ص‎ ) ARABIC MATHEMATICAL LOOPED SAD → ARABIC LETTER SAD # +1EEB1 ; 0635 ; MA # ( ‎𞺱‎ → ‎ص‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK SAD → ARABIC LETTER SAD # +FEBB ; 0635 ; MA # ( ‎ﺻ‎ → ‎ص‎ ) ARABIC LETTER SAD INITIAL FORM → ARABIC LETTER SAD # +FEBC ; 0635 ; MA # ( ‎ﺼ‎ → ‎ص‎ ) ARABIC LETTER SAD MEDIAL FORM → ARABIC LETTER SAD # +FEBA ; 0635 ; MA # ( ‎ﺺ‎ → ‎ص‎ ) ARABIC LETTER SAD FINAL FORM → ARABIC LETTER SAD # +FEB9 ; 0635 ; MA # ( ‎ﺹ‎ → ‎ص‎ ) ARABIC LETTER SAD ISOLATED FORM → ARABIC LETTER SAD # + +069E ; 0635 06DB ; MA # ( ‎ڞ‎ → ‎صۛ‎ ) ARABIC LETTER SAD WITH THREE DOTS ABOVE → ARABIC LETTER SAD, ARABIC SMALL HIGH THREE DOTS # + +08AF ; 0635 0324 0323 ; MA # ( ‎ࢯ‎ → ‎ص̤̣‎ ) ARABIC LETTER SAD WITH THREE DOTS BELOW → ARABIC LETTER SAD, COMBINING DIAERESIS BELOW, COMBINING DOT BELOW # →‎ص࣮࣭‎→ + +FCB1 ; 0635 062D ; MA # ( ‎ﲱ‎ → ‎صح‎ ) ARABIC LIGATURE SAD WITH HAH INITIAL FORM → ARABIC LETTER SAD, ARABIC LETTER HAH # +FC20 ; 0635 062D ; MA # ( ‎ﰠ‎ → ‎صح‎ ) ARABIC LIGATURE SAD WITH HAH ISOLATED FORM → ARABIC LETTER SAD, ARABIC LETTER HAH # + +FD65 ; 0635 062D 062D ; MA # ( ‎ﵥ‎ → ‎صحح‎ ) ARABIC LIGATURE SAD WITH HAH WITH HAH INITIAL FORM → ARABIC LETTER SAD, ARABIC LETTER HAH, ARABIC LETTER HAH # 
+FD64 ; 0635 062D 062D ; MA # ( ‎ﵤ‎ → ‎صحح‎ ) ARABIC LIGATURE SAD WITH HAH WITH HAH FINAL FORM → ARABIC LETTER SAD, ARABIC LETTER HAH, ARABIC LETTER HAH # + +FDA9 ; 0635 062D 0649 ; MA # ( ‎ﶩ‎ → ‎صحى‎ ) ARABIC LIGATURE SAD WITH HAH WITH YEH FINAL FORM → ARABIC LETTER SAD, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎صحي‎→ + +FCB2 ; 0635 062E ; MA # ( ‎ﲲ‎ → ‎صخ‎ ) ARABIC LIGATURE SAD WITH KHAH INITIAL FORM → ARABIC LETTER SAD, ARABIC LETTER KHAH # + +FD2B ; 0635 0631 ; MA # ( ‎ﴫ‎ → ‎صر‎ ) ARABIC LIGATURE SAD WITH REH FINAL FORM → ARABIC LETTER SAD, ARABIC LETTER REH # +FD0F ; 0635 0631 ; MA # ( ‎ﴏ‎ → ‎صر‎ ) ARABIC LIGATURE SAD WITH REH ISOLATED FORM → ARABIC LETTER SAD, ARABIC LETTER REH # + +FDF5 ; 0635 0644 0639 0645 ; MA # ( ‎ﷵ‎ → ‎صلعم‎ ) ARABIC LIGATURE SALAM ISOLATED FORM → ARABIC LETTER SAD, ARABIC LETTER LAM, ARABIC LETTER AIN, ARABIC LETTER MEEM # + +FDF9 ; 0635 0644 0649 ; MA # ( ‎ﷹ‎ → ‎صلى‎ ) ARABIC LIGATURE SALLA ISOLATED FORM → ARABIC LETTER SAD, ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA # +FDF0 ; 0635 0644 0649 ; MA # ( ‎ﷰ‎ → ‎صلى‎ ) ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN ISOLATED FORM → ARABIC LETTER SAD, ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA # →‎صلے‎→ + +FDFA ; 0635 0644 0649 0020 006C 0644 0644 006F 0020 0639 0644 0649 006F 0020 0648 0633 0644 0645 ; MA #* ( ‎ﷺ‎ → ‎صلى lللo علىo وسلم‎ ) ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM → ARABIC LETTER SAD, ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA, SPACE, LATIN SMALL LETTER L, ARABIC LETTER LAM, ARABIC LETTER LAM, LATIN SMALL LETTER O, SPACE, ARABIC LETTER AIN, ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA, LATIN SMALL LETTER O, SPACE, ARABIC LETTER WAW, ARABIC LETTER SEEN, ARABIC LETTER LAM, ARABIC LETTER MEEM # →‎صلى الله عليه وسلم‎→ + +FCB3 ; 0635 0645 ; MA # ( ‎ﲳ‎ → ‎صم‎ ) ARABIC LIGATURE SAD WITH MEEM INITIAL FORM → ARABIC LETTER SAD, ARABIC LETTER MEEM # +FC21 ; 0635 0645 ; MA # ( ‎ﰡ‎ → ‎صم‎ ) ARABIC LIGATURE SAD WITH MEEM ISOLATED FORM → ARABIC LETTER SAD, ARABIC 
LETTER MEEM # + +FDC5 ; 0635 0645 0645 ; MA # ( ‎ﷅ‎ → ‎صمم‎ ) ARABIC LIGATURE SAD WITH MEEM WITH MEEM INITIAL FORM → ARABIC LETTER SAD, ARABIC LETTER MEEM, ARABIC LETTER MEEM # +FD66 ; 0635 0645 0645 ; MA # ( ‎ﵦ‎ → ‎صمم‎ ) ARABIC LIGATURE SAD WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER SAD, ARABIC LETTER MEEM, ARABIC LETTER MEEM # + +FD21 ; 0635 0649 ; MA # ( ‎ﴡ‎ → ‎صى‎ ) ARABIC LIGATURE SAD WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER SAD, ARABIC LETTER ALEF MAKSURA # +FD05 ; 0635 0649 ; MA # ( ‎ﴅ‎ → ‎صى‎ ) ARABIC LIGATURE SAD WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER SAD, ARABIC LETTER ALEF MAKSURA # +FD22 ; 0635 0649 ; MA # ( ‎ﴢ‎ → ‎صى‎ ) ARABIC LIGATURE SAD WITH YEH FINAL FORM → ARABIC LETTER SAD, ARABIC LETTER ALEF MAKSURA # →‎صي‎→ +FD06 ; 0635 0649 ; MA # ( ‎ﴆ‎ → ‎صى‎ ) ARABIC LIGATURE SAD WITH YEH ISOLATED FORM → ARABIC LETTER SAD, ARABIC LETTER ALEF MAKSURA # →‎صي‎→ + +1EE19 ; 0636 ; MA # ( ‎𞸙‎ → ‎ض‎ ) ARABIC MATHEMATICAL DAD → ARABIC LETTER DAD # +1EE39 ; 0636 ; MA # ( ‎𞸹‎ → ‎ض‎ ) ARABIC MATHEMATICAL INITIAL DAD → ARABIC LETTER DAD # +1EE59 ; 0636 ; MA # ( ‎𞹙‎ → ‎ض‎ ) ARABIC MATHEMATICAL TAILED DAD → ARABIC LETTER DAD # +1EE79 ; 0636 ; MA # ( ‎𞹹‎ → ‎ض‎ ) ARABIC MATHEMATICAL STRETCHED DAD → ARABIC LETTER DAD # +1EE99 ; 0636 ; MA # ( ‎𞺙‎ → ‎ض‎ ) ARABIC MATHEMATICAL LOOPED DAD → ARABIC LETTER DAD # +1EEB9 ; 0636 ; MA # ( ‎𞺹‎ → ‎ض‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK DAD → ARABIC LETTER DAD # +FEBF ; 0636 ; MA # ( ‎ﺿ‎ → ‎ض‎ ) ARABIC LETTER DAD INITIAL FORM → ARABIC LETTER DAD # +FEC0 ; 0636 ; MA # ( ‎ﻀ‎ → ‎ض‎ ) ARABIC LETTER DAD MEDIAL FORM → ARABIC LETTER DAD # +FEBE ; 0636 ; MA # ( ‎ﺾ‎ → ‎ض‎ ) ARABIC LETTER DAD FINAL FORM → ARABIC LETTER DAD # +FEBD ; 0636 ; MA # ( ‎ﺽ‎ → ‎ض‎ ) ARABIC LETTER DAD ISOLATED FORM → ARABIC LETTER DAD # + +FCB4 ; 0636 062C ; MA # ( ‎ﲴ‎ → ‎ضج‎ ) ARABIC LIGATURE DAD WITH JEEM INITIAL FORM → ARABIC LETTER DAD, ARABIC LETTER JEEM # +FC22 ; 0636 062C ; MA # ( ‎ﰢ‎ → ‎ضج‎ ) ARABIC LIGATURE DAD WITH JEEM ISOLATED FORM → ARABIC 
LETTER DAD, ARABIC LETTER JEEM # + +FCB5 ; 0636 062D ; MA # ( ‎ﲵ‎ → ‎ضح‎ ) ARABIC LIGATURE DAD WITH HAH INITIAL FORM → ARABIC LETTER DAD, ARABIC LETTER HAH # +FC23 ; 0636 062D ; MA # ( ‎ﰣ‎ → ‎ضح‎ ) ARABIC LIGATURE DAD WITH HAH ISOLATED FORM → ARABIC LETTER DAD, ARABIC LETTER HAH # + +FD6E ; 0636 062D 0649 ; MA # ( ‎ﵮ‎ → ‎ضحى‎ ) ARABIC LIGATURE DAD WITH HAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER DAD, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # +FDAB ; 0636 062D 0649 ; MA # ( ‎ﶫ‎ → ‎ضحى‎ ) ARABIC LIGATURE DAD WITH HAH WITH YEH FINAL FORM → ARABIC LETTER DAD, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎ضحي‎→ + +FCB6 ; 0636 062E ; MA # ( ‎ﲶ‎ → ‎ضخ‎ ) ARABIC LIGATURE DAD WITH KHAH INITIAL FORM → ARABIC LETTER DAD, ARABIC LETTER KHAH # +FC24 ; 0636 062E ; MA # ( ‎ﰤ‎ → ‎ضخ‎ ) ARABIC LIGATURE DAD WITH KHAH ISOLATED FORM → ARABIC LETTER DAD, ARABIC LETTER KHAH # + +FD70 ; 0636 062E 0645 ; MA # ( ‎ﵰ‎ → ‎ضخم‎ ) ARABIC LIGATURE DAD WITH KHAH WITH MEEM INITIAL FORM → ARABIC LETTER DAD, ARABIC LETTER KHAH, ARABIC LETTER MEEM # +FD6F ; 0636 062E 0645 ; MA # ( ‎ﵯ‎ → ‎ضخم‎ ) ARABIC LIGATURE DAD WITH KHAH WITH MEEM FINAL FORM → ARABIC LETTER DAD, ARABIC LETTER KHAH, ARABIC LETTER MEEM # + +FD2C ; 0636 0631 ; MA # ( ‎ﴬ‎ → ‎ضر‎ ) ARABIC LIGATURE DAD WITH REH FINAL FORM → ARABIC LETTER DAD, ARABIC LETTER REH # +FD10 ; 0636 0631 ; MA # ( ‎ﴐ‎ → ‎ضر‎ ) ARABIC LIGATURE DAD WITH REH ISOLATED FORM → ARABIC LETTER DAD, ARABIC LETTER REH # + +FCB7 ; 0636 0645 ; MA # ( ‎ﲷ‎ → ‎ضم‎ ) ARABIC LIGATURE DAD WITH MEEM INITIAL FORM → ARABIC LETTER DAD, ARABIC LETTER MEEM # +FC25 ; 0636 0645 ; MA # ( ‎ﰥ‎ → ‎ضم‎ ) ARABIC LIGATURE DAD WITH MEEM ISOLATED FORM → ARABIC LETTER DAD, ARABIC LETTER MEEM # + +FD23 ; 0636 0649 ; MA # ( ‎ﴣ‎ → ‎ضى‎ ) ARABIC LIGATURE DAD WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER DAD, ARABIC LETTER ALEF MAKSURA # +FD07 ; 0636 0649 ; MA # ( ‎ﴇ‎ → ‎ضى‎ ) ARABIC LIGATURE DAD WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER DAD, ARABIC LETTER ALEF MAKSURA # +FD24 ; 0636 
0649 ; MA # ( ‎ﴤ‎ → ‎ضى‎ ) ARABIC LIGATURE DAD WITH YEH FINAL FORM → ARABIC LETTER DAD, ARABIC LETTER ALEF MAKSURA # →‎ضي‎→ +FD08 ; 0636 0649 ; MA # ( ‎ﴈ‎ → ‎ضى‎ ) ARABIC LIGATURE DAD WITH YEH ISOLATED FORM → ARABIC LETTER DAD, ARABIC LETTER ALEF MAKSURA # →‎ضي‎→ + +102E8 ; 0637 ; MA #* ( 𐋨 → ‎ط‎ ) COPTIC EPACT DIGIT EIGHT → ARABIC LETTER TAH # +1EE08 ; 0637 ; MA # ( ‎𞸈‎ → ‎ط‎ ) ARABIC MATHEMATICAL TAH → ARABIC LETTER TAH # +1EE68 ; 0637 ; MA # ( ‎𞹨‎ → ‎ط‎ ) ARABIC MATHEMATICAL STRETCHED TAH → ARABIC LETTER TAH # +1EE88 ; 0637 ; MA # ( ‎𞺈‎ → ‎ط‎ ) ARABIC MATHEMATICAL LOOPED TAH → ARABIC LETTER TAH # +1EEA8 ; 0637 ; MA # ( ‎𞺨‎ → ‎ط‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK TAH → ARABIC LETTER TAH # +FEC3 ; 0637 ; MA # ( ‎ﻃ‎ → ‎ط‎ ) ARABIC LETTER TAH INITIAL FORM → ARABIC LETTER TAH # +FEC4 ; 0637 ; MA # ( ‎ﻄ‎ → ‎ط‎ ) ARABIC LETTER TAH MEDIAL FORM → ARABIC LETTER TAH # +FEC2 ; 0637 ; MA # ( ‎ﻂ‎ → ‎ط‎ ) ARABIC LETTER TAH FINAL FORM → ARABIC LETTER TAH # +FEC1 ; 0637 ; MA # ( ‎ﻁ‎ → ‎ط‎ ) ARABIC LETTER TAH ISOLATED FORM → ARABIC LETTER TAH # + +069F ; 0637 06DB ; MA # ( ‎ڟ‎ → ‎طۛ‎ ) ARABIC LETTER TAH WITH THREE DOTS ABOVE → ARABIC LETTER TAH, ARABIC SMALL HIGH THREE DOTS # + +FCB8 ; 0637 062D ; MA # ( ‎ﲸ‎ → ‎طح‎ ) ARABIC LIGATURE TAH WITH HAH INITIAL FORM → ARABIC LETTER TAH, ARABIC LETTER HAH # +FC26 ; 0637 062D ; MA # ( ‎ﰦ‎ → ‎طح‎ ) ARABIC LIGATURE TAH WITH HAH ISOLATED FORM → ARABIC LETTER TAH, ARABIC LETTER HAH # + +FD33 ; 0637 0645 ; MA # ( ‎ﴳ‎ → ‎طم‎ ) ARABIC LIGATURE TAH WITH MEEM INITIAL FORM → ARABIC LETTER TAH, ARABIC LETTER MEEM # +FD3A ; 0637 0645 ; MA # ( ‎ﴺ‎ → ‎طم‎ ) ARABIC LIGATURE TAH WITH MEEM MEDIAL FORM → ARABIC LETTER TAH, ARABIC LETTER MEEM # +FC27 ; 0637 0645 ; MA # ( ‎ﰧ‎ → ‎طم‎ ) ARABIC LIGATURE TAH WITH MEEM ISOLATED FORM → ARABIC LETTER TAH, ARABIC LETTER MEEM # + +FD72 ; 0637 0645 062D ; MA # ( ‎ﵲ‎ → ‎طمح‎ ) ARABIC LIGATURE TAH WITH MEEM WITH HAH INITIAL FORM → ARABIC LETTER TAH, ARABIC LETTER MEEM, ARABIC LETTER HAH # +FD71 ; 0637 0645 062D ; MA # 
( ‎ﵱ‎ → ‎طمح‎ ) ARABIC LIGATURE TAH WITH MEEM WITH HAH FINAL FORM → ARABIC LETTER TAH, ARABIC LETTER MEEM, ARABIC LETTER HAH # + +FD73 ; 0637 0645 0645 ; MA # ( ‎ﵳ‎ → ‎طمم‎ ) ARABIC LIGATURE TAH WITH MEEM WITH MEEM INITIAL FORM → ARABIC LETTER TAH, ARABIC LETTER MEEM, ARABIC LETTER MEEM # + +FD74 ; 0637 0645 0649 ; MA # ( ‎ﵴ‎ → ‎طمى‎ ) ARABIC LIGATURE TAH WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER TAH, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎طمي‎→ + +FD11 ; 0637 0649 ; MA # ( ‎ﴑ‎ → ‎طى‎ ) ARABIC LIGATURE TAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER TAH, ARABIC LETTER ALEF MAKSURA # +FCF5 ; 0637 0649 ; MA # ( ‎ﳵ‎ → ‎طى‎ ) ARABIC LIGATURE TAH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER TAH, ARABIC LETTER ALEF MAKSURA # +FD12 ; 0637 0649 ; MA # ( ‎ﴒ‎ → ‎طى‎ ) ARABIC LIGATURE TAH WITH YEH FINAL FORM → ARABIC LETTER TAH, ARABIC LETTER ALEF MAKSURA # →‎طي‎→ +FCF6 ; 0637 0649 ; MA # ( ‎ﳶ‎ → ‎طى‎ ) ARABIC LIGATURE TAH WITH YEH ISOLATED FORM → ARABIC LETTER TAH, ARABIC LETTER ALEF MAKSURA # →‎طي‎→ + +1EE1A ; 0638 ; MA # ( ‎𞸚‎ → ‎ظ‎ ) ARABIC MATHEMATICAL ZAH → ARABIC LETTER ZAH # +1EE7A ; 0638 ; MA # ( ‎𞹺‎ → ‎ظ‎ ) ARABIC MATHEMATICAL STRETCHED ZAH → ARABIC LETTER ZAH # +1EE9A ; 0638 ; MA # ( ‎𞺚‎ → ‎ظ‎ ) ARABIC MATHEMATICAL LOOPED ZAH → ARABIC LETTER ZAH # +1EEBA ; 0638 ; MA # ( ‎𞺺‎ → ‎ظ‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK ZAH → ARABIC LETTER ZAH # +FEC7 ; 0638 ; MA # ( ‎ﻇ‎ → ‎ظ‎ ) ARABIC LETTER ZAH INITIAL FORM → ARABIC LETTER ZAH # +FEC8 ; 0638 ; MA # ( ‎ﻈ‎ → ‎ظ‎ ) ARABIC LETTER ZAH MEDIAL FORM → ARABIC LETTER ZAH # +FEC6 ; 0638 ; MA # ( ‎ﻆ‎ → ‎ظ‎ ) ARABIC LETTER ZAH FINAL FORM → ARABIC LETTER ZAH # +FEC5 ; 0638 ; MA # ( ‎ﻅ‎ → ‎ظ‎ ) ARABIC LETTER ZAH ISOLATED FORM → ARABIC LETTER ZAH # + +FCB9 ; 0638 0645 ; MA # ( ‎ﲹ‎ → ‎ظم‎ ) ARABIC LIGATURE ZAH WITH MEEM INITIAL FORM → ARABIC LETTER ZAH, ARABIC LETTER MEEM # +FD3B ; 0638 0645 ; MA # ( ‎ﴻ‎ → ‎ظم‎ ) ARABIC LIGATURE ZAH WITH MEEM MEDIAL FORM → ARABIC LETTER ZAH, ARABIC LETTER MEEM # +FC28 ; 0638 0645 ; 
MA # ( ‎ﰨ‎ → ‎ظم‎ ) ARABIC LIGATURE ZAH WITH MEEM ISOLATED FORM → ARABIC LETTER ZAH, ARABIC LETTER MEEM # + +060F ; 0639 ; MA #* ( ؏ → ‎ع‎ ) ARABIC SIGN MISRA → ARABIC LETTER AIN # +1EE0F ; 0639 ; MA # ( ‎𞸏‎ → ‎ع‎ ) ARABIC MATHEMATICAL AIN → ARABIC LETTER AIN # +1EE2F ; 0639 ; MA # ( ‎𞸯‎ → ‎ع‎ ) ARABIC MATHEMATICAL INITIAL AIN → ARABIC LETTER AIN # +1EE4F ; 0639 ; MA # ( ‎𞹏‎ → ‎ع‎ ) ARABIC MATHEMATICAL TAILED AIN → ARABIC LETTER AIN # +1EE6F ; 0639 ; MA # ( ‎𞹯‎ → ‎ع‎ ) ARABIC MATHEMATICAL STRETCHED AIN → ARABIC LETTER AIN # +1EE8F ; 0639 ; MA # ( ‎𞺏‎ → ‎ع‎ ) ARABIC MATHEMATICAL LOOPED AIN → ARABIC LETTER AIN # +1EEAF ; 0639 ; MA # ( ‎𞺯‎ → ‎ع‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK AIN → ARABIC LETTER AIN # +FECB ; 0639 ; MA # ( ‎ﻋ‎ → ‎ع‎ ) ARABIC LETTER AIN INITIAL FORM → ARABIC LETTER AIN # +FECC ; 0639 ; MA # ( ‎ﻌ‎ → ‎ع‎ ) ARABIC LETTER AIN MEDIAL FORM → ARABIC LETTER AIN # +FECA ; 0639 ; MA # ( ‎ﻊ‎ → ‎ع‎ ) ARABIC LETTER AIN FINAL FORM → ARABIC LETTER AIN # +FEC9 ; 0639 ; MA # ( ‎ﻉ‎ → ‎ع‎ ) ARABIC LETTER AIN ISOLATED FORM → ARABIC LETTER AIN # + +FCBA ; 0639 062C ; MA # ( ‎ﲺ‎ → ‎عج‎ ) ARABIC LIGATURE AIN WITH JEEM INITIAL FORM → ARABIC LETTER AIN, ARABIC LETTER JEEM # +FC29 ; 0639 062C ; MA # ( ‎ﰩ‎ → ‎عج‎ ) ARABIC LIGATURE AIN WITH JEEM ISOLATED FORM → ARABIC LETTER AIN, ARABIC LETTER JEEM # + +FDC4 ; 0639 062C 0645 ; MA # ( ‎ﷄ‎ → ‎عجم‎ ) ARABIC LIGATURE AIN WITH JEEM WITH MEEM INITIAL FORM → ARABIC LETTER AIN, ARABIC LETTER JEEM, ARABIC LETTER MEEM # +FD75 ; 0639 062C 0645 ; MA # ( ‎ﵵ‎ → ‎عجم‎ ) ARABIC LIGATURE AIN WITH JEEM WITH MEEM FINAL FORM → ARABIC LETTER AIN, ARABIC LETTER JEEM, ARABIC LETTER MEEM # + +FDF7 ; 0639 0644 0649 006F ; MA # ( ‎ﷷ‎ → ‎علىo‎ ) ARABIC LIGATURE ALAYHE ISOLATED FORM → ARABIC LETTER AIN, ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA, LATIN SMALL LETTER O # →‎عليه‎→ + +FCBB ; 0639 0645 ; MA # ( ‎ﲻ‎ → ‎عم‎ ) ARABIC LIGATURE AIN WITH MEEM INITIAL FORM → ARABIC LETTER AIN, ARABIC LETTER MEEM # +FC2A ; 0639 0645 ; MA # ( ‎ﰪ‎ → ‎عم‎ ) ARABIC 
LIGATURE AIN WITH MEEM ISOLATED FORM → ARABIC LETTER AIN, ARABIC LETTER MEEM # + +FD77 ; 0639 0645 0645 ; MA # ( ‎ﵷ‎ → ‎عمم‎ ) ARABIC LIGATURE AIN WITH MEEM WITH MEEM INITIAL FORM → ARABIC LETTER AIN, ARABIC LETTER MEEM, ARABIC LETTER MEEM # +FD76 ; 0639 0645 0645 ; MA # ( ‎ﵶ‎ → ‎عمم‎ ) ARABIC LIGATURE AIN WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER AIN, ARABIC LETTER MEEM, ARABIC LETTER MEEM # + +FD78 ; 0639 0645 0649 ; MA # ( ‎ﵸ‎ → ‎عمى‎ ) ARABIC LIGATURE AIN WITH MEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER AIN, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # +FDB6 ; 0639 0645 0649 ; MA # ( ‎ﶶ‎ → ‎عمى‎ ) ARABIC LIGATURE AIN WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER AIN, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎عمي‎→ + +FD13 ; 0639 0649 ; MA # ( ‎ﴓ‎ → ‎عى‎ ) ARABIC LIGATURE AIN WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER AIN, ARABIC LETTER ALEF MAKSURA # +FCF7 ; 0639 0649 ; MA # ( ‎ﳷ‎ → ‎عى‎ ) ARABIC LIGATURE AIN WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER AIN, ARABIC LETTER ALEF MAKSURA # +FD14 ; 0639 0649 ; MA # ( ‎ﴔ‎ → ‎عى‎ ) ARABIC LIGATURE AIN WITH YEH FINAL FORM → ARABIC LETTER AIN, ARABIC LETTER ALEF MAKSURA # →‎عي‎→ +FCF8 ; 0639 0649 ; MA # ( ‎ﳸ‎ → ‎عى‎ ) ARABIC LIGATURE AIN WITH YEH ISOLATED FORM → ARABIC LETTER AIN, ARABIC LETTER ALEF MAKSURA # →‎عي‎→ + +1EE1B ; 063A ; MA # ( ‎𞸛‎ → ‎غ‎ ) ARABIC MATHEMATICAL GHAIN → ARABIC LETTER GHAIN # +1EE3B ; 063A ; MA # ( ‎𞸻‎ → ‎غ‎ ) ARABIC MATHEMATICAL INITIAL GHAIN → ARABIC LETTER GHAIN # +1EE5B ; 063A ; MA # ( ‎𞹛‎ → ‎غ‎ ) ARABIC MATHEMATICAL TAILED GHAIN → ARABIC LETTER GHAIN # +1EE7B ; 063A ; MA # ( ‎𞹻‎ → ‎غ‎ ) ARABIC MATHEMATICAL STRETCHED GHAIN → ARABIC LETTER GHAIN # +1EE9B ; 063A ; MA # ( ‎𞺛‎ → ‎غ‎ ) ARABIC MATHEMATICAL LOOPED GHAIN → ARABIC LETTER GHAIN # +1EEBB ; 063A ; MA # ( ‎𞺻‎ → ‎غ‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK GHAIN → ARABIC LETTER GHAIN # +FECF ; 063A ; MA # ( ‎ﻏ‎ → ‎غ‎ ) ARABIC LETTER GHAIN INITIAL FORM → ARABIC LETTER GHAIN # +FED0 ; 063A ; MA # ( ‎ﻐ‎ → ‎غ‎ ) 
ARABIC LETTER GHAIN MEDIAL FORM → ARABIC LETTER GHAIN # +FECE ; 063A ; MA # ( ‎ﻎ‎ → ‎غ‎ ) ARABIC LETTER GHAIN FINAL FORM → ARABIC LETTER GHAIN # +FECD ; 063A ; MA # ( ‎ﻍ‎ → ‎غ‎ ) ARABIC LETTER GHAIN ISOLATED FORM → ARABIC LETTER GHAIN # + +FCBC ; 063A 062C ; MA # ( ‎ﲼ‎ → ‎غج‎ ) ARABIC LIGATURE GHAIN WITH JEEM INITIAL FORM → ARABIC LETTER GHAIN, ARABIC LETTER JEEM # +FC2B ; 063A 062C ; MA # ( ‎ﰫ‎ → ‎غج‎ ) ARABIC LIGATURE GHAIN WITH JEEM ISOLATED FORM → ARABIC LETTER GHAIN, ARABIC LETTER JEEM # + +FCBD ; 063A 0645 ; MA # ( ‎ﲽ‎ → ‎غم‎ ) ARABIC LIGATURE GHAIN WITH MEEM INITIAL FORM → ARABIC LETTER GHAIN, ARABIC LETTER MEEM # +FC2C ; 063A 0645 ; MA # ( ‎ﰬ‎ → ‎غم‎ ) ARABIC LIGATURE GHAIN WITH MEEM ISOLATED FORM → ARABIC LETTER GHAIN, ARABIC LETTER MEEM # + +FD79 ; 063A 0645 0645 ; MA # ( ‎ﵹ‎ → ‎غمم‎ ) ARABIC LIGATURE GHAIN WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER GHAIN, ARABIC LETTER MEEM, ARABIC LETTER MEEM # + +FD7B ; 063A 0645 0649 ; MA # ( ‎ﵻ‎ → ‎غمى‎ ) ARABIC LIGATURE GHAIN WITH MEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER GHAIN, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # +FD7A ; 063A 0645 0649 ; MA # ( ‎ﵺ‎ → ‎غمى‎ ) ARABIC LIGATURE GHAIN WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER GHAIN, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎غمي‎→ + +FD15 ; 063A 0649 ; MA # ( ‎ﴕ‎ → ‎غى‎ ) ARABIC LIGATURE GHAIN WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER GHAIN, ARABIC LETTER ALEF MAKSURA # +FCF9 ; 063A 0649 ; MA # ( ‎ﳹ‎ → ‎غى‎ ) ARABIC LIGATURE GHAIN WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER GHAIN, ARABIC LETTER ALEF MAKSURA # +FD16 ; 063A 0649 ; MA # ( ‎ﴖ‎ → ‎غى‎ ) ARABIC LIGATURE GHAIN WITH YEH FINAL FORM → ARABIC LETTER GHAIN, ARABIC LETTER ALEF MAKSURA # →‎غي‎→ +FCFA ; 063A 0649 ; MA # ( ‎ﳺ‎ → ‎غى‎ ) ARABIC LIGATURE GHAIN WITH YEH ISOLATED FORM → ARABIC LETTER GHAIN, ARABIC LETTER ALEF MAKSURA # →‎غي‎→ + +1EE10 ; 0641 ; MA # ( ‎𞸐‎ → ‎ف‎ ) ARABIC MATHEMATICAL FEH → ARABIC LETTER FEH # +1EE30 ; 0641 ; MA # ( ‎𞸰‎ → ‎ف‎ ) ARABIC MATHEMATICAL 
INITIAL FEH → ARABIC LETTER FEH # +1EE70 ; 0641 ; MA # ( ‎𞹰‎ → ‎ف‎ ) ARABIC MATHEMATICAL STRETCHED FEH → ARABIC LETTER FEH # +1EE90 ; 0641 ; MA # ( ‎𞺐‎ → ‎ف‎ ) ARABIC MATHEMATICAL LOOPED FEH → ARABIC LETTER FEH # +1EEB0 ; 0641 ; MA # ( ‎𞺰‎ → ‎ف‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK FEH → ARABIC LETTER FEH # +FED3 ; 0641 ; MA # ( ‎ﻓ‎ → ‎ف‎ ) ARABIC LETTER FEH INITIAL FORM → ARABIC LETTER FEH # +FED4 ; 0641 ; MA # ( ‎ﻔ‎ → ‎ف‎ ) ARABIC LETTER FEH MEDIAL FORM → ARABIC LETTER FEH # +FED2 ; 0641 ; MA # ( ‎ﻒ‎ → ‎ف‎ ) ARABIC LETTER FEH FINAL FORM → ARABIC LETTER FEH # +FED1 ; 0641 ; MA # ( ‎ﻑ‎ → ‎ف‎ ) ARABIC LETTER FEH ISOLATED FORM → ARABIC LETTER FEH # +06A7 ; 0641 ; MA # ( ‎ڧ‎ → ‎ف‎ ) ARABIC LETTER QAF WITH DOT ABOVE → ARABIC LETTER FEH # + +FCBE ; 0641 062C ; MA # ( ‎ﲾ‎ → ‎فج‎ ) ARABIC LIGATURE FEH WITH JEEM INITIAL FORM → ARABIC LETTER FEH, ARABIC LETTER JEEM # +FC2D ; 0641 062C ; MA # ( ‎ﰭ‎ → ‎فج‎ ) ARABIC LIGATURE FEH WITH JEEM ISOLATED FORM → ARABIC LETTER FEH, ARABIC LETTER JEEM # + +FCBF ; 0641 062D ; MA # ( ‎ﲿ‎ → ‎فح‎ ) ARABIC LIGATURE FEH WITH HAH INITIAL FORM → ARABIC LETTER FEH, ARABIC LETTER HAH # +FC2E ; 0641 062D ; MA # ( ‎ﰮ‎ → ‎فح‎ ) ARABIC LIGATURE FEH WITH HAH ISOLATED FORM → ARABIC LETTER FEH, ARABIC LETTER HAH # + +FCC0 ; 0641 062E ; MA # ( ‎ﳀ‎ → ‎فخ‎ ) ARABIC LIGATURE FEH WITH KHAH INITIAL FORM → ARABIC LETTER FEH, ARABIC LETTER KHAH # +FC2F ; 0641 062E ; MA # ( ‎ﰯ‎ → ‎فخ‎ ) ARABIC LIGATURE FEH WITH KHAH ISOLATED FORM → ARABIC LETTER FEH, ARABIC LETTER KHAH # + +FD7D ; 0641 062E 0645 ; MA # ( ‎ﵽ‎ → ‎فخم‎ ) ARABIC LIGATURE FEH WITH KHAH WITH MEEM INITIAL FORM → ARABIC LETTER FEH, ARABIC LETTER KHAH, ARABIC LETTER MEEM # +FD7C ; 0641 062E 0645 ; MA # ( ‎ﵼ‎ → ‎فخم‎ ) ARABIC LIGATURE FEH WITH KHAH WITH MEEM FINAL FORM → ARABIC LETTER FEH, ARABIC LETTER KHAH, ARABIC LETTER MEEM # + +FCC1 ; 0641 0645 ; MA # ( ‎ﳁ‎ → ‎فم‎ ) ARABIC LIGATURE FEH WITH MEEM INITIAL FORM → ARABIC LETTER FEH, ARABIC LETTER MEEM # +FC30 ; 0641 0645 ; MA # ( ‎ﰰ‎ → ‎فم‎ ) ARABIC 
LIGATURE FEH WITH MEEM ISOLATED FORM → ARABIC LETTER FEH, ARABIC LETTER MEEM # + +FDC1 ; 0641 0645 0649 ; MA # ( ‎ﷁ‎ → ‎فمى‎ ) ARABIC LIGATURE FEH WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER FEH, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎فمي‎→ + +FC7C ; 0641 0649 ; MA # ( ‎ﱼ‎ → ‎فى‎ ) ARABIC LIGATURE FEH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER FEH, ARABIC LETTER ALEF MAKSURA # +FC31 ; 0641 0649 ; MA # ( ‎ﰱ‎ → ‎فى‎ ) ARABIC LIGATURE FEH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER FEH, ARABIC LETTER ALEF MAKSURA # +FC7D ; 0641 0649 ; MA # ( ‎ﱽ‎ → ‎فى‎ ) ARABIC LIGATURE FEH WITH YEH FINAL FORM → ARABIC LETTER FEH, ARABIC LETTER ALEF MAKSURA # →‎في‎→ +FC32 ; 0641 0649 ; MA # ( ‎ﰲ‎ → ‎فى‎ ) ARABIC LIGATURE FEH WITH YEH ISOLATED FORM → ARABIC LETTER FEH, ARABIC LETTER ALEF MAKSURA # →‎في‎→ + +1EE1E ; 06A1 ; MA # ( ‎𞸞‎ → ‎ڡ‎ ) ARABIC MATHEMATICAL DOTLESS FEH → ARABIC LETTER DOTLESS FEH # +1EE7E ; 06A1 ; MA # ( ‎𞹾‎ → ‎ڡ‎ ) ARABIC MATHEMATICAL STRETCHED DOTLESS FEH → ARABIC LETTER DOTLESS FEH # +08BB ; 06A1 ; MA # ( ‎ࢻ‎ → ‎ڡ‎ ) ARABIC LETTER AFRICAN FEH → ARABIC LETTER DOTLESS FEH # +066F ; 06A1 ; MA # ( ‎ٯ‎ → ‎ڡ‎ ) ARABIC LETTER DOTLESS QAF → ARABIC LETTER DOTLESS FEH # +1EE1F ; 06A1 ; MA # ( ‎𞸟‎ → ‎ڡ‎ ) ARABIC MATHEMATICAL DOTLESS QAF → ARABIC LETTER DOTLESS FEH # →‎ٯ‎→ +1EE5F ; 06A1 ; MA # ( ‎𞹟‎ → ‎ڡ‎ ) ARABIC MATHEMATICAL TAILED DOTLESS QAF → ARABIC LETTER DOTLESS FEH # →‎ٯ‎→ +08BC ; 06A1 ; MA # ( ‎ࢼ‎ → ‎ڡ‎ ) ARABIC LETTER AFRICAN QAF → ARABIC LETTER DOTLESS FEH # →‎ٯ‎→ + +06A4 ; 06A1 06DB ; MA # ( ‎ڤ‎ → ‎ڡۛ‎ ) ARABIC LETTER VEH → ARABIC LETTER DOTLESS FEH, ARABIC SMALL HIGH THREE DOTS # +FB6C ; 06A1 06DB ; MA # ( ‎ﭬ‎ → ‎ڡۛ‎ ) ARABIC LETTER VEH INITIAL FORM → ARABIC LETTER DOTLESS FEH, ARABIC SMALL HIGH THREE DOTS # →‎ڤ‎→ +FB6D ; 06A1 06DB ; MA # ( ‎ﭭ‎ → ‎ڡۛ‎ ) ARABIC LETTER VEH MEDIAL FORM → ARABIC LETTER DOTLESS FEH, ARABIC SMALL HIGH THREE DOTS # →‎ڤ‎→ +FB6B ; 06A1 06DB ; MA # ( ‎ﭫ‎ → ‎ڡۛ‎ ) ARABIC LETTER VEH FINAL FORM → ARABIC LETTER DOTLESS 
FEH, ARABIC SMALL HIGH THREE DOTS # →‎ڤ‎→ +FB6A ; 06A1 06DB ; MA # ( ‎ﭪ‎ → ‎ڡۛ‎ ) ARABIC LETTER VEH ISOLATED FORM → ARABIC LETTER DOTLESS FEH, ARABIC SMALL HIGH THREE DOTS # →‎ڤ‎→ +06A8 ; 06A1 06DB ; MA # ( ‎ڨ‎ → ‎ڡۛ‎ ) ARABIC LETTER QAF WITH THREE DOTS ABOVE → ARABIC LETTER DOTLESS FEH, ARABIC SMALL HIGH THREE DOTS # →‎ڤ‎→ + +08A4 ; 06A2 06DB ; MA # ( ‎ࢤ‎ → ‎ڢۛ‎ ) ARABIC LETTER FEH WITH DOT BELOW AND THREE DOTS ABOVE → ARABIC LETTER FEH WITH DOT MOVED BELOW, ARABIC SMALL HIGH THREE DOTS # + +FB70 ; 06A6 ; MA # ( ‎ﭰ‎ → ‎ڦ‎ ) ARABIC LETTER PEHEH INITIAL FORM → ARABIC LETTER PEHEH # +FB71 ; 06A6 ; MA # ( ‎ﭱ‎ → ‎ڦ‎ ) ARABIC LETTER PEHEH MEDIAL FORM → ARABIC LETTER PEHEH # +FB6F ; 06A6 ; MA # ( ‎ﭯ‎ → ‎ڦ‎ ) ARABIC LETTER PEHEH FINAL FORM → ARABIC LETTER PEHEH # +FB6E ; 06A6 ; MA # ( ‎ﭮ‎ → ‎ڦ‎ ) ARABIC LETTER PEHEH ISOLATED FORM → ARABIC LETTER PEHEH # + +1EE12 ; 0642 ; MA # ( ‎𞸒‎ → ‎ق‎ ) ARABIC MATHEMATICAL QAF → ARABIC LETTER QAF # +1EE32 ; 0642 ; MA # ( ‎𞸲‎ → ‎ق‎ ) ARABIC MATHEMATICAL INITIAL QAF → ARABIC LETTER QAF # +1EE52 ; 0642 ; MA # ( ‎𞹒‎ → ‎ق‎ ) ARABIC MATHEMATICAL TAILED QAF → ARABIC LETTER QAF # +1EE72 ; 0642 ; MA # ( ‎𞹲‎ → ‎ق‎ ) ARABIC MATHEMATICAL STRETCHED QAF → ARABIC LETTER QAF # +1EE92 ; 0642 ; MA # ( ‎𞺒‎ → ‎ق‎ ) ARABIC MATHEMATICAL LOOPED QAF → ARABIC LETTER QAF # +1EEB2 ; 0642 ; MA # ( ‎𞺲‎ → ‎ق‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK QAF → ARABIC LETTER QAF # +FED7 ; 0642 ; MA # ( ‎ﻗ‎ → ‎ق‎ ) ARABIC LETTER QAF INITIAL FORM → ARABIC LETTER QAF # +FED8 ; 0642 ; MA # ( ‎ﻘ‎ → ‎ق‎ ) ARABIC LETTER QAF MEDIAL FORM → ARABIC LETTER QAF # +FED6 ; 0642 ; MA # ( ‎ﻖ‎ → ‎ق‎ ) ARABIC LETTER QAF FINAL FORM → ARABIC LETTER QAF # +FED5 ; 0642 ; MA # ( ‎ﻕ‎ → ‎ق‎ ) ARABIC LETTER QAF ISOLATED FORM → ARABIC LETTER QAF # + +FCC2 ; 0642 062D ; MA # ( ‎ﳂ‎ → ‎قح‎ ) ARABIC LIGATURE QAF WITH HAH INITIAL FORM → ARABIC LETTER QAF, ARABIC LETTER HAH # +FC33 ; 0642 062D ; MA # ( ‎ﰳ‎ → ‎قح‎ ) ARABIC LIGATURE QAF WITH HAH ISOLATED FORM → ARABIC LETTER QAF, ARABIC LETTER HAH # + +FDF1 ; 
0642 0644 0649 ; MA # ( ‎ﷱ‎ → ‎قلى‎ ) ARABIC LIGATURE QALA USED AS KORANIC STOP SIGN ISOLATED FORM → ARABIC LETTER QAF, ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA # →‎قلے‎→ + +FCC3 ; 0642 0645 ; MA # ( ‎ﳃ‎ → ‎قم‎ ) ARABIC LIGATURE QAF WITH MEEM INITIAL FORM → ARABIC LETTER QAF, ARABIC LETTER MEEM # +FC34 ; 0642 0645 ; MA # ( ‎ﰴ‎ → ‎قم‎ ) ARABIC LIGATURE QAF WITH MEEM ISOLATED FORM → ARABIC LETTER QAF, ARABIC LETTER MEEM # + +FDB4 ; 0642 0645 062D ; MA # ( ‎ﶴ‎ → ‎قمح‎ ) ARABIC LIGATURE QAF WITH MEEM WITH HAH INITIAL FORM → ARABIC LETTER QAF, ARABIC LETTER MEEM, ARABIC LETTER HAH # +FD7E ; 0642 0645 062D ; MA # ( ‎ﵾ‎ → ‎قمح‎ ) ARABIC LIGATURE QAF WITH MEEM WITH HAH FINAL FORM → ARABIC LETTER QAF, ARABIC LETTER MEEM, ARABIC LETTER HAH # + +FD7F ; 0642 0645 0645 ; MA # ( ‎ﵿ‎ → ‎قمم‎ ) ARABIC LIGATURE QAF WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER QAF, ARABIC LETTER MEEM, ARABIC LETTER MEEM # + +FDB2 ; 0642 0645 0649 ; MA # ( ‎ﶲ‎ → ‎قمى‎ ) ARABIC LIGATURE QAF WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER QAF, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎قمي‎→ + +FC7E ; 0642 0649 ; MA # ( ‎ﱾ‎ → ‎قى‎ ) ARABIC LIGATURE QAF WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER QAF, ARABIC LETTER ALEF MAKSURA # +FC35 ; 0642 0649 ; MA # ( ‎ﰵ‎ → ‎قى‎ ) ARABIC LIGATURE QAF WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER QAF, ARABIC LETTER ALEF MAKSURA # +FC7F ; 0642 0649 ; MA # ( ‎ﱿ‎ → ‎قى‎ ) ARABIC LIGATURE QAF WITH YEH FINAL FORM → ARABIC LETTER QAF, ARABIC LETTER ALEF MAKSURA # →‎قي‎→ +FC36 ; 0642 0649 ; MA # ( ‎ﰶ‎ → ‎قى‎ ) ARABIC LIGATURE QAF WITH YEH ISOLATED FORM → ARABIC LETTER QAF, ARABIC LETTER ALEF MAKSURA # →‎قي‎→ + +1EE0A ; 0643 ; MA # ( ‎𞸊‎ → ‎ك‎ ) ARABIC MATHEMATICAL KAF → ARABIC LETTER KAF # +1EE2A ; 0643 ; MA # ( ‎𞸪‎ → ‎ك‎ ) ARABIC MATHEMATICAL INITIAL KAF → ARABIC LETTER KAF # +1EE6A ; 0643 ; MA # ( ‎𞹪‎ → ‎ك‎ ) ARABIC MATHEMATICAL STRETCHED KAF → ARABIC LETTER KAF # +FEDB ; 0643 ; MA # ( ‎ﻛ‎ → ‎ك‎ ) ARABIC LETTER KAF INITIAL FORM → ARABIC LETTER KAF # +FEDC 
; 0643 ; MA # ( ‎ﻜ‎ → ‎ك‎ ) ARABIC LETTER KAF MEDIAL FORM → ARABIC LETTER KAF # +FEDA ; 0643 ; MA # ( ‎ﻚ‎ → ‎ك‎ ) ARABIC LETTER KAF FINAL FORM → ARABIC LETTER KAF # +FED9 ; 0643 ; MA # ( ‎ﻙ‎ → ‎ك‎ ) ARABIC LETTER KAF ISOLATED FORM → ARABIC LETTER KAF # +06A9 ; 0643 ; MA # ( ‎ک‎ → ‎ك‎ ) ARABIC LETTER KEHEH → ARABIC LETTER KAF # +FB90 ; 0643 ; MA # ( ‎ﮐ‎ → ‎ك‎ ) ARABIC LETTER KEHEH INITIAL FORM → ARABIC LETTER KAF # →‎ک‎→ +FB91 ; 0643 ; MA # ( ‎ﮑ‎ → ‎ك‎ ) ARABIC LETTER KEHEH MEDIAL FORM → ARABIC LETTER KAF # →‎ک‎→ +FB8F ; 0643 ; MA # ( ‎ﮏ‎ → ‎ك‎ ) ARABIC LETTER KEHEH FINAL FORM → ARABIC LETTER KAF # →‎ک‎→ +FB8E ; 0643 ; MA # ( ‎ﮎ‎ → ‎ك‎ ) ARABIC LETTER KEHEH ISOLATED FORM → ARABIC LETTER KAF # →‎ک‎→ +06AA ; 0643 ; MA # ( ‎ڪ‎ → ‎ك‎ ) ARABIC LETTER SWASH KAF → ARABIC LETTER KAF # + +06AD ; 0643 06DB ; MA # ( ‎ڭ‎ → ‎كۛ‎ ) ARABIC LETTER NG → ARABIC LETTER KAF, ARABIC SMALL HIGH THREE DOTS # +FBD5 ; 0643 06DB ; MA # ( ‎ﯕ‎ → ‎كۛ‎ ) ARABIC LETTER NG INITIAL FORM → ARABIC LETTER KAF, ARABIC SMALL HIGH THREE DOTS # →‎ڭ‎→ +FBD6 ; 0643 06DB ; MA # ( ‎ﯖ‎ → ‎كۛ‎ ) ARABIC LETTER NG MEDIAL FORM → ARABIC LETTER KAF, ARABIC SMALL HIGH THREE DOTS # →‎ڭ‎→ +FBD4 ; 0643 06DB ; MA # ( ‎ﯔ‎ → ‎كۛ‎ ) ARABIC LETTER NG FINAL FORM → ARABIC LETTER KAF, ARABIC SMALL HIGH THREE DOTS # →‎ڭ‎→ +FBD3 ; 0643 06DB ; MA # ( ‎ﯓ‎ → ‎كۛ‎ ) ARABIC LETTER NG ISOLATED FORM → ARABIC LETTER KAF, ARABIC SMALL HIGH THREE DOTS # →‎ڭ‎→ +0763 ; 0643 06DB ; MA # ( ‎ݣ‎ → ‎كۛ‎ ) ARABIC LETTER KEHEH WITH THREE DOTS ABOVE → ARABIC LETTER KAF, ARABIC SMALL HIGH THREE DOTS # →‎ڭ‎→ + +08C2 ; 0643 0306 ; MA # ( ‎ࣂ‎ → ‎ك̆‎ ) ARABIC LETTER KEHEH WITH SMALL V → ARABIC LETTER KAF, COMBINING BREVE # →‎کٚ‎→ + +FC80 ; 0643 006C ; MA # ( ‎ﲀ‎ → ‎كl‎ ) ARABIC LIGATURE KAF WITH ALEF FINAL FORM → ARABIC LETTER KAF, LATIN SMALL LETTER L # →‎كا‎→ +FC37 ; 0643 006C ; MA # ( ‎ﰷ‎ → ‎كl‎ ) ARABIC LIGATURE KAF WITH ALEF ISOLATED FORM → ARABIC LETTER KAF, LATIN SMALL LETTER L # →‎كا‎→ + +FCC4 ; 0643 062C ; MA # ( ‎ﳄ‎ → ‎كج‎ ) ARABIC LIGATURE KAF 
WITH JEEM INITIAL FORM → ARABIC LETTER KAF, ARABIC LETTER JEEM # +FC38 ; 0643 062C ; MA # ( ‎ﰸ‎ → ‎كج‎ ) ARABIC LIGATURE KAF WITH JEEM ISOLATED FORM → ARABIC LETTER KAF, ARABIC LETTER JEEM # + +FCC5 ; 0643 062D ; MA # ( ‎ﳅ‎ → ‎كح‎ ) ARABIC LIGATURE KAF WITH HAH INITIAL FORM → ARABIC LETTER KAF, ARABIC LETTER HAH # +FC39 ; 0643 062D ; MA # ( ‎ﰹ‎ → ‎كح‎ ) ARABIC LIGATURE KAF WITH HAH ISOLATED FORM → ARABIC LETTER KAF, ARABIC LETTER HAH # + +FCC6 ; 0643 062E ; MA # ( ‎ﳆ‎ → ‎كخ‎ ) ARABIC LIGATURE KAF WITH KHAH INITIAL FORM → ARABIC LETTER KAF, ARABIC LETTER KHAH # +FC3A ; 0643 062E ; MA # ( ‎ﰺ‎ → ‎كخ‎ ) ARABIC LIGATURE KAF WITH KHAH ISOLATED FORM → ARABIC LETTER KAF, ARABIC LETTER KHAH # + +FCC7 ; 0643 0644 ; MA # ( ‎ﳇ‎ → ‎كل‎ ) ARABIC LIGATURE KAF WITH LAM INITIAL FORM → ARABIC LETTER KAF, ARABIC LETTER LAM # +FCEB ; 0643 0644 ; MA # ( ‎ﳫ‎ → ‎كل‎ ) ARABIC LIGATURE KAF WITH LAM MEDIAL FORM → ARABIC LETTER KAF, ARABIC LETTER LAM # +FC81 ; 0643 0644 ; MA # ( ‎ﲁ‎ → ‎كل‎ ) ARABIC LIGATURE KAF WITH LAM FINAL FORM → ARABIC LETTER KAF, ARABIC LETTER LAM # +FC3B ; 0643 0644 ; MA # ( ‎ﰻ‎ → ‎كل‎ ) ARABIC LIGATURE KAF WITH LAM ISOLATED FORM → ARABIC LETTER KAF, ARABIC LETTER LAM # + +FCC8 ; 0643 0645 ; MA # ( ‎ﳈ‎ → ‎كم‎ ) ARABIC LIGATURE KAF WITH MEEM INITIAL FORM → ARABIC LETTER KAF, ARABIC LETTER MEEM # +FCEC ; 0643 0645 ; MA # ( ‎ﳬ‎ → ‎كم‎ ) ARABIC LIGATURE KAF WITH MEEM MEDIAL FORM → ARABIC LETTER KAF, ARABIC LETTER MEEM # +FC82 ; 0643 0645 ; MA # ( ‎ﲂ‎ → ‎كم‎ ) ARABIC LIGATURE KAF WITH MEEM FINAL FORM → ARABIC LETTER KAF, ARABIC LETTER MEEM # +FC3C ; 0643 0645 ; MA # ( ‎ﰼ‎ → ‎كم‎ ) ARABIC LIGATURE KAF WITH MEEM ISOLATED FORM → ARABIC LETTER KAF, ARABIC LETTER MEEM # + +FDC3 ; 0643 0645 0645 ; MA # ( ‎ﷃ‎ → ‎كمم‎ ) ARABIC LIGATURE KAF WITH MEEM WITH MEEM INITIAL FORM → ARABIC LETTER KAF, ARABIC LETTER MEEM, ARABIC LETTER MEEM # +FDBB ; 0643 0645 0645 ; MA # ( ‎ﶻ‎ → ‎كمم‎ ) ARABIC LIGATURE KAF WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER KAF, ARABIC LETTER MEEM, ARABIC LETTER 
MEEM # + +FDB7 ; 0643 0645 0649 ; MA # ( ‎ﶷ‎ → ‎كمى‎ ) ARABIC LIGATURE KAF WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER KAF, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎كمي‎→ + +FC83 ; 0643 0649 ; MA # ( ‎ﲃ‎ → ‎كى‎ ) ARABIC LIGATURE KAF WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER KAF, ARABIC LETTER ALEF MAKSURA # +FC3D ; 0643 0649 ; MA # ( ‎ﰽ‎ → ‎كى‎ ) ARABIC LIGATURE KAF WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER KAF, ARABIC LETTER ALEF MAKSURA # +FC84 ; 0643 0649 ; MA # ( ‎ﲄ‎ → ‎كى‎ ) ARABIC LIGATURE KAF WITH YEH FINAL FORM → ARABIC LETTER KAF, ARABIC LETTER ALEF MAKSURA # →‎كي‎→ +FC3E ; 0643 0649 ; MA # ( ‎ﰾ‎ → ‎كى‎ ) ARABIC LIGATURE KAF WITH YEH ISOLATED FORM → ARABIC LETTER KAF, ARABIC LETTER ALEF MAKSURA # →‎كي‎→ + +0762 ; 06AC ; MA # ( ‎ݢ‎ → ‎ڬ‎ ) ARABIC LETTER KEHEH WITH DOT ABOVE → ARABIC LETTER KAF WITH DOT ABOVE # + +FB94 ; 06AF ; MA # ( ‎ﮔ‎ → ‎گ‎ ) ARABIC LETTER GAF INITIAL FORM → ARABIC LETTER GAF # +FB95 ; 06AF ; MA # ( ‎ﮕ‎ → ‎گ‎ ) ARABIC LETTER GAF MEDIAL FORM → ARABIC LETTER GAF # +FB93 ; 06AF ; MA # ( ‎ﮓ‎ → ‎گ‎ ) ARABIC LETTER GAF FINAL FORM → ARABIC LETTER GAF # +FB92 ; 06AF ; MA # ( ‎ﮒ‎ → ‎گ‎ ) ARABIC LETTER GAF ISOLATED FORM → ARABIC LETTER GAF # +08B0 ; 06AF ; MA # ( ‎ࢰ‎ → ‎گ‎ ) ARABIC LETTER GAF WITH INVERTED STROKE → ARABIC LETTER GAF # + +06B4 ; 06AF 06DB ; MA # ( ‎ڴ‎ → ‎گۛ‎ ) ARABIC LETTER GAF WITH THREE DOTS ABOVE → ARABIC LETTER GAF, ARABIC SMALL HIGH THREE DOTS # + +FB9C ; 06B1 ; MA # ( ‎ﮜ‎ → ‎ڱ‎ ) ARABIC LETTER NGOEH INITIAL FORM → ARABIC LETTER NGOEH # +FB9D ; 06B1 ; MA # ( ‎ﮝ‎ → ‎ڱ‎ ) ARABIC LETTER NGOEH MEDIAL FORM → ARABIC LETTER NGOEH # +FB9B ; 06B1 ; MA # ( ‎ﮛ‎ → ‎ڱ‎ ) ARABIC LETTER NGOEH FINAL FORM → ARABIC LETTER NGOEH # +FB9A ; 06B1 ; MA # ( ‎ﮚ‎ → ‎ڱ‎ ) ARABIC LETTER NGOEH ISOLATED FORM → ARABIC LETTER NGOEH # + +FB98 ; 06B3 ; MA # ( ‎ﮘ‎ → ‎ڳ‎ ) ARABIC LETTER GUEH INITIAL FORM → ARABIC LETTER GUEH # +FB99 ; 06B3 ; MA # ( ‎ﮙ‎ → ‎ڳ‎ ) ARABIC LETTER GUEH MEDIAL FORM → ARABIC LETTER GUEH # +FB97 ; 06B3 ; MA # ( ‎ﮗ‎ → 
‎ڳ‎ ) ARABIC LETTER GUEH FINAL FORM → ARABIC LETTER GUEH # +FB96 ; 06B3 ; MA # ( ‎ﮖ‎ → ‎ڳ‎ ) ARABIC LETTER GUEH ISOLATED FORM → ARABIC LETTER GUEH # + +1EE0B ; 0644 ; MA # ( ‎𞸋‎ → ‎ل‎ ) ARABIC MATHEMATICAL LAM → ARABIC LETTER LAM # +1EE2B ; 0644 ; MA # ( ‎𞸫‎ → ‎ل‎ ) ARABIC MATHEMATICAL INITIAL LAM → ARABIC LETTER LAM # +1EE4B ; 0644 ; MA # ( ‎𞹋‎ → ‎ل‎ ) ARABIC MATHEMATICAL TAILED LAM → ARABIC LETTER LAM # +1EE8B ; 0644 ; MA # ( ‎𞺋‎ → ‎ل‎ ) ARABIC MATHEMATICAL LOOPED LAM → ARABIC LETTER LAM # +1EEAB ; 0644 ; MA # ( ‎𞺫‎ → ‎ل‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK LAM → ARABIC LETTER LAM # +FEDF ; 0644 ; MA # ( ‎ﻟ‎ → ‎ل‎ ) ARABIC LETTER LAM INITIAL FORM → ARABIC LETTER LAM # +FEE0 ; 0644 ; MA # ( ‎ﻠ‎ → ‎ل‎ ) ARABIC LETTER LAM MEDIAL FORM → ARABIC LETTER LAM # +FEDE ; 0644 ; MA # ( ‎ﻞ‎ → ‎ل‎ ) ARABIC LETTER LAM FINAL FORM → ARABIC LETTER LAM # +FEDD ; 0644 ; MA # ( ‎ﻝ‎ → ‎ل‎ ) ARABIC LETTER LAM ISOLATED FORM → ARABIC LETTER LAM # + +06B7 ; 0644 06DB ; MA # ( ‎ڷ‎ → ‎لۛ‎ ) ARABIC LETTER LAM WITH THREE DOTS ABOVE → ARABIC LETTER LAM, ARABIC SMALL HIGH THREE DOTS # + +06B5 ; 0644 0306 ; MA # ( ‎ڵ‎ → ‎ل̆‎ ) ARABIC LETTER LAM WITH SMALL V → ARABIC LETTER LAM, COMBINING BREVE # →‎لٚ‎→ + +FEFC ; 0644 006C ; MA # ( ‎ﻼ‎ → ‎لl‎ ) ARABIC LIGATURE LAM WITH ALEF FINAL FORM → ARABIC LETTER LAM, LATIN SMALL LETTER L # →‎لا‎→ +FEFB ; 0644 006C ; MA # ( ‎ﻻ‎ → ‎لl‎ ) ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM → ARABIC LETTER LAM, LATIN SMALL LETTER L # →‎لا‎→ + +FEFA ; 0644 006C 0655 ; MA # ( ‎ﻺ‎ → ‎لlٕ‎ ) ARABIC LIGATURE LAM WITH ALEF WITH HAMZA BELOW FINAL FORM → ARABIC LETTER LAM, LATIN SMALL LETTER L, ARABIC HAMZA BELOW # →‎لإ‎→ +FEF9 ; 0644 006C 0655 ; MA # ( ‎ﻹ‎ → ‎لlٕ‎ ) ARABIC LIGATURE LAM WITH ALEF WITH HAMZA BELOW ISOLATED FORM → ARABIC LETTER LAM, LATIN SMALL LETTER L, ARABIC HAMZA BELOW # →‎لإ‎→ + +FEF8 ; 0644 006C 0674 ; MA # ( ‎ﻸ‎ → ‎لlٴ‎ ) ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE FINAL FORM → ARABIC LETTER LAM, LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA # →‎لأ‎→ 
+FEF7 ; 0644 006C 0674 ; MA # ( ‎ﻷ‎ → ‎لlٴ‎ ) ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE ISOLATED FORM → ARABIC LETTER LAM, LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA # →‎لأ‎→ + +FCCD ; 0644 006F ; MA # ( ‎ﳍ‎ → ‎لo‎ ) ARABIC LIGATURE LAM WITH HEH INITIAL FORM → ARABIC LETTER LAM, LATIN SMALL LETTER O # →‎له‎→ + +FEF6 ; 0644 0622 ; MA # ( ‎ﻶ‎ → ‎لآ‎ ) ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER ALEF WITH MADDA ABOVE # +FEF5 ; 0644 0622 ; MA # ( ‎ﻵ‎ → ‎لآ‎ ) ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE ISOLATED FORM → ARABIC LETTER LAM, ARABIC LETTER ALEF WITH MADDA ABOVE # + +FCC9 ; 0644 062C ; MA # ( ‎ﳉ‎ → ‎لج‎ ) ARABIC LIGATURE LAM WITH JEEM INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER JEEM # +FC3F ; 0644 062C ; MA # ( ‎ﰿ‎ → ‎لج‎ ) ARABIC LIGATURE LAM WITH JEEM ISOLATED FORM → ARABIC LETTER LAM, ARABIC LETTER JEEM # + +FD83 ; 0644 062C 062C ; MA # ( ‎ﶃ‎ → ‎لجج‎ ) ARABIC LIGATURE LAM WITH JEEM WITH JEEM INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER JEEM, ARABIC LETTER JEEM # +FD84 ; 0644 062C 062C ; MA # ( ‎ﶄ‎ → ‎لجج‎ ) ARABIC LIGATURE LAM WITH JEEM WITH JEEM FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER JEEM, ARABIC LETTER JEEM # + +FDBA ; 0644 062C 0645 ; MA # ( ‎ﶺ‎ → ‎لجم‎ ) ARABIC LIGATURE LAM WITH JEEM WITH MEEM INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER JEEM, ARABIC LETTER MEEM # +FDBC ; 0644 062C 0645 ; MA # ( ‎ﶼ‎ → ‎لجم‎ ) ARABIC LIGATURE LAM WITH JEEM WITH MEEM FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER JEEM, ARABIC LETTER MEEM # + +FDAC ; 0644 062C 0649 ; MA # ( ‎ﶬ‎ → ‎لجى‎ ) ARABIC LIGATURE LAM WITH JEEM WITH YEH FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎لجي‎→ + +FCCA ; 0644 062D ; MA # ( ‎ﳊ‎ → ‎لح‎ ) ARABIC LIGATURE LAM WITH HAH INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER HAH # +FC40 ; 0644 062D ; MA # ( ‎ﱀ‎ → ‎لح‎ ) ARABIC LIGATURE LAM WITH HAH ISOLATED FORM → ARABIC LETTER LAM, ARABIC LETTER HAH # + +FDB5 ; 0644 062D 0645 ; MA # ( ‎ﶵ‎ → 
‎لحم‎ ) ARABIC LIGATURE LAM WITH HAH WITH MEEM INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER HAH, ARABIC LETTER MEEM # +FD80 ; 0644 062D 0645 ; MA # ( ‎ﶀ‎ → ‎لحم‎ ) ARABIC LIGATURE LAM WITH HAH WITH MEEM FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER HAH, ARABIC LETTER MEEM # + +FD82 ; 0644 062D 0649 ; MA # ( ‎ﶂ‎ → ‎لحى‎ ) ARABIC LIGATURE LAM WITH HAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # +FD81 ; 0644 062D 0649 ; MA # ( ‎ﶁ‎ → ‎لحى‎ ) ARABIC LIGATURE LAM WITH HAH WITH YEH FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎لحي‎→ + +FCCB ; 0644 062E ; MA # ( ‎ﳋ‎ → ‎لخ‎ ) ARABIC LIGATURE LAM WITH KHAH INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER KHAH # +FC41 ; 0644 062E ; MA # ( ‎ﱁ‎ → ‎لخ‎ ) ARABIC LIGATURE LAM WITH KHAH ISOLATED FORM → ARABIC LETTER LAM, ARABIC LETTER KHAH # + +FD86 ; 0644 062E 0645 ; MA # ( ‎ﶆ‎ → ‎لخم‎ ) ARABIC LIGATURE LAM WITH KHAH WITH MEEM INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER KHAH, ARABIC LETTER MEEM # +FD85 ; 0644 062E 0645 ; MA # ( ‎ﶅ‎ → ‎لخم‎ ) ARABIC LIGATURE LAM WITH KHAH WITH MEEM FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER KHAH, ARABIC LETTER MEEM # + +FCCC ; 0644 0645 ; MA # ( ‎ﳌ‎ → ‎لم‎ ) ARABIC LIGATURE LAM WITH MEEM INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER MEEM # +FCED ; 0644 0645 ; MA # ( ‎ﳭ‎ → ‎لم‎ ) ARABIC LIGATURE LAM WITH MEEM MEDIAL FORM → ARABIC LETTER LAM, ARABIC LETTER MEEM # +FC85 ; 0644 0645 ; MA # ( ‎ﲅ‎ → ‎لم‎ ) ARABIC LIGATURE LAM WITH MEEM FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER MEEM # +FC42 ; 0644 0645 ; MA # ( ‎ﱂ‎ → ‎لم‎ ) ARABIC LIGATURE LAM WITH MEEM ISOLATED FORM → ARABIC LETTER LAM, ARABIC LETTER MEEM # + +FD88 ; 0644 0645 062D ; MA # ( ‎ﶈ‎ → ‎لمح‎ ) ARABIC LIGATURE LAM WITH MEEM WITH HAH INITIAL FORM → ARABIC LETTER LAM, ARABIC LETTER MEEM, ARABIC LETTER HAH # +FD87 ; 0644 0645 062D ; MA # ( ‎ﶇ‎ → ‎لمح‎ ) ARABIC LIGATURE LAM WITH MEEM WITH HAH FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER MEEM, 
ARABIC LETTER HAH # + +FDAD ; 0644 0645 0649 ; MA # ( ‎ﶭ‎ → ‎لمى‎ ) ARABIC LIGATURE LAM WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎لمي‎→ + +FC86 ; 0644 0649 ; MA # ( ‎ﲆ‎ → ‎لى‎ ) ARABIC LIGATURE LAM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA # +FC43 ; 0644 0649 ; MA # ( ‎ﱃ‎ → ‎لى‎ ) ARABIC LIGATURE LAM WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA # +FC87 ; 0644 0649 ; MA # ( ‎ﲇ‎ → ‎لى‎ ) ARABIC LIGATURE LAM WITH YEH FINAL FORM → ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA # →‎لي‎→ +FC44 ; 0644 0649 ; MA # ( ‎ﱄ‎ → ‎لى‎ ) ARABIC LIGATURE LAM WITH YEH ISOLATED FORM → ARABIC LETTER LAM, ARABIC LETTER ALEF MAKSURA # →‎لي‎→ + +1EE0C ; 0645 ; MA # ( ‎𞸌‎ → ‎م‎ ) ARABIC MATHEMATICAL MEEM → ARABIC LETTER MEEM # +1EE2C ; 0645 ; MA # ( ‎𞸬‎ → ‎م‎ ) ARABIC MATHEMATICAL INITIAL MEEM → ARABIC LETTER MEEM # +1EE6C ; 0645 ; MA # ( ‎𞹬‎ → ‎م‎ ) ARABIC MATHEMATICAL STRETCHED MEEM → ARABIC LETTER MEEM # +1EE8C ; 0645 ; MA # ( ‎𞺌‎ → ‎م‎ ) ARABIC MATHEMATICAL LOOPED MEEM → ARABIC LETTER MEEM # +1EEAC ; 0645 ; MA # ( ‎𞺬‎ → ‎م‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK MEEM → ARABIC LETTER MEEM # +FEE3 ; 0645 ; MA # ( ‎ﻣ‎ → ‎م‎ ) ARABIC LETTER MEEM INITIAL FORM → ARABIC LETTER MEEM # +FEE4 ; 0645 ; MA # ( ‎ﻤ‎ → ‎م‎ ) ARABIC LETTER MEEM MEDIAL FORM → ARABIC LETTER MEEM # +FEE2 ; 0645 ; MA # ( ‎ﻢ‎ → ‎م‎ ) ARABIC LETTER MEEM FINAL FORM → ARABIC LETTER MEEM # +FEE1 ; 0645 ; MA # ( ‎ﻡ‎ → ‎م‎ ) ARABIC LETTER MEEM ISOLATED FORM → ARABIC LETTER MEEM # + +08A7 ; 0645 06DB ; MA # ( ‎ࢧ‎ → ‎مۛ‎ ) ARABIC LETTER MEEM WITH THREE DOTS ABOVE → ARABIC LETTER MEEM, ARABIC SMALL HIGH THREE DOTS # + +FC88 ; 0645 006C ; MA # ( ‎ﲈ‎ → ‎مl‎ ) ARABIC LIGATURE MEEM WITH ALEF FINAL FORM → ARABIC LETTER MEEM, LATIN SMALL LETTER L # →‎ما‎→ + +FCCE ; 0645 062C ; MA # ( ‎ﳎ‎ → ‎مج‎ ) ARABIC LIGATURE MEEM WITH JEEM INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER JEEM # +FC45 ; 0645 062C ; MA # ( ‎ﱅ‎ → 
‎مج‎ ) ARABIC LIGATURE MEEM WITH JEEM ISOLATED FORM → ARABIC LETTER MEEM, ARABIC LETTER JEEM # + +FD8C ; 0645 062C 062D ; MA # ( ‎ﶌ‎ → ‎مجح‎ ) ARABIC LIGATURE MEEM WITH JEEM WITH HAH INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER JEEM, ARABIC LETTER HAH # + +FD92 ; 0645 062C 062E ; MA # ( ‎ﶒ‎ → ‎مجخ‎ ) ARABIC LIGATURE MEEM WITH JEEM WITH KHAH INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER JEEM, ARABIC LETTER KHAH # + +FD8D ; 0645 062C 0645 ; MA # ( ‎ﶍ‎ → ‎مجم‎ ) ARABIC LIGATURE MEEM WITH JEEM WITH MEEM INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER JEEM, ARABIC LETTER MEEM # + +FDC0 ; 0645 062C 0649 ; MA # ( ‎ﷀ‎ → ‎مجى‎ ) ARABIC LIGATURE MEEM WITH JEEM WITH YEH FINAL FORM → ARABIC LETTER MEEM, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎مجي‎→ + +FCCF ; 0645 062D ; MA # ( ‎ﳏ‎ → ‎مح‎ ) ARABIC LIGATURE MEEM WITH HAH INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER HAH # +FC46 ; 0645 062D ; MA # ( ‎ﱆ‎ → ‎مح‎ ) ARABIC LIGATURE MEEM WITH HAH ISOLATED FORM → ARABIC LETTER MEEM, ARABIC LETTER HAH # + +FD89 ; 0645 062D 062C ; MA # ( ‎ﶉ‎ → ‎محج‎ ) ARABIC LIGATURE MEEM WITH HAH WITH JEEM INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER HAH, ARABIC LETTER JEEM # + +FD8A ; 0645 062D 0645 ; MA # ( ‎ﶊ‎ → ‎محم‎ ) ARABIC LIGATURE MEEM WITH HAH WITH MEEM INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER HAH, ARABIC LETTER MEEM # + +FDF4 ; 0645 062D 0645 062F ; MA # ( ‎ﷴ‎ → ‎محمد‎ ) ARABIC LIGATURE MOHAMMAD ISOLATED FORM → ARABIC LETTER MEEM, ARABIC LETTER HAH, ARABIC LETTER MEEM, ARABIC LETTER DAL # + +FD8B ; 0645 062D 0649 ; MA # ( ‎ﶋ‎ → ‎محى‎ ) ARABIC LIGATURE MEEM WITH HAH WITH YEH FINAL FORM → ARABIC LETTER MEEM, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎محي‎→ + +FCD0 ; 0645 062E ; MA # ( ‎ﳐ‎ → ‎مخ‎ ) ARABIC LIGATURE MEEM WITH KHAH INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER KHAH # +FC47 ; 0645 062E ; MA # ( ‎ﱇ‎ → ‎مخ‎ ) ARABIC LIGATURE MEEM WITH KHAH ISOLATED FORM → ARABIC LETTER MEEM, ARABIC LETTER KHAH # + +FD8E ; 0645 062E 062C ; MA # ( ‎ﶎ‎ → ‎مخج‎ ) 
ARABIC LIGATURE MEEM WITH KHAH WITH JEEM INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER KHAH, ARABIC LETTER JEEM # + +FD8F ; 0645 062E 0645 ; MA # ( ‎ﶏ‎ → ‎مخم‎ ) ARABIC LIGATURE MEEM WITH KHAH WITH MEEM INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER KHAH, ARABIC LETTER MEEM # + +FDB9 ; 0645 062E 0649 ; MA # ( ‎ﶹ‎ → ‎مخى‎ ) ARABIC LIGATURE MEEM WITH KHAH WITH YEH FINAL FORM → ARABIC LETTER MEEM, ARABIC LETTER KHAH, ARABIC LETTER ALEF MAKSURA # →‎مخي‎→ + +FCD1 ; 0645 0645 ; MA # ( ‎ﳑ‎ → ‎مم‎ ) ARABIC LIGATURE MEEM WITH MEEM INITIAL FORM → ARABIC LETTER MEEM, ARABIC LETTER MEEM # +FC89 ; 0645 0645 ; MA # ( ‎ﲉ‎ → ‎مم‎ ) ARABIC LIGATURE MEEM WITH MEEM FINAL FORM → ARABIC LETTER MEEM, ARABIC LETTER MEEM # +FC48 ; 0645 0645 ; MA # ( ‎ﱈ‎ → ‎مم‎ ) ARABIC LIGATURE MEEM WITH MEEM ISOLATED FORM → ARABIC LETTER MEEM, ARABIC LETTER MEEM # + +FDB1 ; 0645 0645 0649 ; MA # ( ‎ﶱ‎ → ‎ممى‎ ) ARABIC LIGATURE MEEM WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER MEEM, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎ممي‎→ + +FC49 ; 0645 0649 ; MA # ( ‎ﱉ‎ → ‎مى‎ ) ARABIC LIGATURE MEEM WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # +FC4A ; 0645 0649 ; MA # ( ‎ﱊ‎ → ‎مى‎ ) ARABIC LIGATURE MEEM WITH YEH ISOLATED FORM → ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎مي‎→ + +06FE ; 0645 10EFA ; MA #* ( ‎۾‎ → ‎م𐻺‎ ) ARABIC SIGN SINDHI POSTPOSITION MEN → ARABIC LETTER MEEM, ARABIC DOUBLE VERTICAL BAR BELOW # + +1EE0D ; 0646 ; MA # ( ‎𞸍‎ → ‎ن‎ ) ARABIC MATHEMATICAL NOON → ARABIC LETTER NOON # +1EE2D ; 0646 ; MA # ( ‎𞸭‎ → ‎ن‎ ) ARABIC MATHEMATICAL INITIAL NOON → ARABIC LETTER NOON # +1EE4D ; 0646 ; MA # ( ‎𞹍‎ → ‎ن‎ ) ARABIC MATHEMATICAL TAILED NOON → ARABIC LETTER NOON # +1EE6D ; 0646 ; MA # ( ‎𞹭‎ → ‎ن‎ ) ARABIC MATHEMATICAL STRETCHED NOON → ARABIC LETTER NOON # +1EE8D ; 0646 ; MA # ( ‎𞺍‎ → ‎ن‎ ) ARABIC MATHEMATICAL LOOPED NOON → ARABIC LETTER NOON # +1EEAD ; 0646 ; MA # ( ‎𞺭‎ → ‎ن‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK NOON → ARABIC LETTER NOON # +FEE7 ; 
0646 ; MA # ( ‎ﻧ‎ → ‎ن‎ ) ARABIC LETTER NOON INITIAL FORM → ARABIC LETTER NOON # +FEE8 ; 0646 ; MA # ( ‎ﻨ‎ → ‎ن‎ ) ARABIC LETTER NOON MEDIAL FORM → ARABIC LETTER NOON # +FEE6 ; 0646 ; MA # ( ‎ﻦ‎ → ‎ن‎ ) ARABIC LETTER NOON FINAL FORM → ARABIC LETTER NOON # +FEE5 ; 0646 ; MA # ( ‎ﻥ‎ → ‎ن‎ ) ARABIC LETTER NOON ISOLATED FORM → ARABIC LETTER NOON # +10EC6 ; 0646 ; MA # ( ‎𐻆‎ → ‎ن‎ ) ARABIC LETTER THIN NOON → ARABIC LETTER NOON # + +0768 ; 0646 0615 ; MA # ( ‎ݨ‎ → ‎نؕ‎ ) ARABIC LETTER NOON WITH SMALL TAH → ARABIC LETTER NOON, ARABIC SMALL HIGH TAH # + +0769 ; 0646 0306 ; MA # ( ‎ݩ‎ → ‎ن̆‎ ) ARABIC LETTER NOON WITH SMALL V → ARABIC LETTER NOON, COMBINING BREVE # →‎نٚ‎→ + +FCD6 ; 0646 006F ; MA # ( ‎ﳖ‎ → ‎نo‎ ) ARABIC LIGATURE NOON WITH HEH INITIAL FORM → ARABIC LETTER NOON, LATIN SMALL LETTER O # →‎نه‎→ +FCEF ; 0646 006F ; MA # ( ‎ﳯ‎ → ‎نo‎ ) ARABIC LIGATURE NOON WITH HEH MEDIAL FORM → ARABIC LETTER NOON, LATIN SMALL LETTER O # →‎نه‎→ + +FDB8 ; 0646 062C 062D ; MA # ( ‎ﶸ‎ → ‎نجح‎ ) ARABIC LIGATURE NOON WITH JEEM WITH HAH INITIAL FORM → ARABIC LETTER NOON, ARABIC LETTER JEEM, ARABIC LETTER HAH # +FDBD ; 0646 062C 062D ; MA # ( ‎ﶽ‎ → ‎نجح‎ ) ARABIC LIGATURE NOON WITH JEEM WITH HAH FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER JEEM, ARABIC LETTER HAH # + +FD98 ; 0646 062C 0645 ; MA # ( ‎ﶘ‎ → ‎نجم‎ ) ARABIC LIGATURE NOON WITH JEEM WITH MEEM INITIAL FORM → ARABIC LETTER NOON, ARABIC LETTER JEEM, ARABIC LETTER MEEM # +FD97 ; 0646 062C 0645 ; MA # ( ‎ﶗ‎ → ‎نجم‎ ) ARABIC LIGATURE NOON WITH JEEM WITH MEEM FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER JEEM, ARABIC LETTER MEEM # + +FD99 ; 0646 062C 0649 ; MA # ( ‎ﶙ‎ → ‎نجى‎ ) ARABIC LIGATURE NOON WITH JEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # +FDC7 ; 0646 062C 0649 ; MA # ( ‎ﷇ‎ → ‎نجى‎ ) ARABIC LIGATURE NOON WITH JEEM WITH YEH FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎نجي‎→ + +FCD3 ; 0646 062D ; MA # ( ‎ﳓ‎ → ‎نح‎ ) ARABIC 
LIGATURE NOON WITH HAH INITIAL FORM → ARABIC LETTER NOON, ARABIC LETTER HAH # +FC4C ; 0646 062D ; MA # ( ‎ﱌ‎ → ‎نح‎ ) ARABIC LIGATURE NOON WITH HAH ISOLATED FORM → ARABIC LETTER NOON, ARABIC LETTER HAH # + +FD95 ; 0646 062D 0645 ; MA # ( ‎ﶕ‎ → ‎نحم‎ ) ARABIC LIGATURE NOON WITH HAH WITH MEEM INITIAL FORM → ARABIC LETTER NOON, ARABIC LETTER HAH, ARABIC LETTER MEEM # + +FD96 ; 0646 062D 0649 ; MA # ( ‎ﶖ‎ → ‎نحى‎ ) ARABIC LIGATURE NOON WITH HAH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # +FDB3 ; 0646 062D 0649 ; MA # ( ‎ﶳ‎ → ‎نحى‎ ) ARABIC LIGATURE NOON WITH HAH WITH YEH FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎نحي‎→ + +FCD4 ; 0646 062E ; MA # ( ‎ﳔ‎ → ‎نخ‎ ) ARABIC LIGATURE NOON WITH KHAH INITIAL FORM → ARABIC LETTER NOON, ARABIC LETTER KHAH # +FC4D ; 0646 062E ; MA # ( ‎ﱍ‎ → ‎نخ‎ ) ARABIC LIGATURE NOON WITH KHAH ISOLATED FORM → ARABIC LETTER NOON, ARABIC LETTER KHAH # + +FC8A ; 0646 0631 ; MA # ( ‎ﲊ‎ → ‎نر‎ ) ARABIC LIGATURE NOON WITH REH FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER REH # + +FC8B ; 0646 0632 ; MA # ( ‎ﲋ‎ → ‎نز‎ ) ARABIC LIGATURE NOON WITH ZAIN FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER ZAIN # + +FCD5 ; 0646 0645 ; MA # ( ‎ﳕ‎ → ‎نم‎ ) ARABIC LIGATURE NOON WITH MEEM INITIAL FORM → ARABIC LETTER NOON, ARABIC LETTER MEEM # +FCEE ; 0646 0645 ; MA # ( ‎ﳮ‎ → ‎نم‎ ) ARABIC LIGATURE NOON WITH MEEM MEDIAL FORM → ARABIC LETTER NOON, ARABIC LETTER MEEM # +FC8C ; 0646 0645 ; MA # ( ‎ﲌ‎ → ‎نم‎ ) ARABIC LIGATURE NOON WITH MEEM FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER MEEM # +FC4E ; 0646 0645 ; MA # ( ‎ﱎ‎ → ‎نم‎ ) ARABIC LIGATURE NOON WITH MEEM ISOLATED FORM → ARABIC LETTER NOON, ARABIC LETTER MEEM # + +FD9B ; 0646 0645 0649 ; MA # ( ‎ﶛ‎ → ‎نمى‎ ) ARABIC LIGATURE NOON WITH MEEM WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # +FD9A ; 0646 0645 0649 ; MA # ( ‎ﶚ‎ → ‎نمى‎ ) ARABIC LIGATURE NOON WITH MEEM WITH YEH 
FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎نمي‎→ + +FC8D ; 0646 0646 ; MA # ( ‎ﲍ‎ → ‎نن‎ ) ARABIC LIGATURE NOON WITH NOON FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER NOON # + +FC8E ; 0646 0649 ; MA # ( ‎ﲎ‎ → ‎نى‎ ) ARABIC LIGATURE NOON WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER ALEF MAKSURA # +FC4F ; 0646 0649 ; MA # ( ‎ﱏ‎ → ‎نى‎ ) ARABIC LIGATURE NOON WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER NOON, ARABIC LETTER ALEF MAKSURA # +FC8F ; 0646 0649 ; MA # ( ‎ﲏ‎ → ‎نى‎ ) ARABIC LIGATURE NOON WITH YEH FINAL FORM → ARABIC LETTER NOON, ARABIC LETTER ALEF MAKSURA # →‎ني‎→ +FC50 ; 0646 0649 ; MA # ( ‎ﱐ‎ → ‎نى‎ ) ARABIC LIGATURE NOON WITH YEH ISOLATED FORM → ARABIC LETTER NOON, ARABIC LETTER ALEF MAKSURA # →‎ني‎→ + +06C2 ; 06C0 ; MA # ( ‎ۂ‎ → ‎ۀ‎ ) ARABIC LETTER HEH GOAL WITH HAMZA ABOVE → ARABIC LETTER HEH WITH YEH ABOVE # →‎ﮤ‎→ +FBA5 ; 06C0 ; MA # ( ‎ﮥ‎ → ‎ۀ‎ ) ARABIC LETTER HEH WITH YEH ABOVE FINAL FORM → ARABIC LETTER HEH WITH YEH ABOVE # +FBA4 ; 06C0 ; MA # ( ‎ﮤ‎ → ‎ۀ‎ ) ARABIC LETTER HEH WITH YEH ABOVE ISOLATED FORM → ARABIC LETTER HEH WITH YEH ABOVE # + +102E4 ; 0648 ; MA #* ( 𐋤 → ‎و‎ ) COPTIC EPACT DIGIT FOUR → ARABIC LETTER WAW # +1EE05 ; 0648 ; MA # ( ‎𞸅‎ → ‎و‎ ) ARABIC MATHEMATICAL WAW → ARABIC LETTER WAW # +1EE85 ; 0648 ; MA # ( ‎𞺅‎ → ‎و‎ ) ARABIC MATHEMATICAL LOOPED WAW → ARABIC LETTER WAW # +1EEA5 ; 0648 ; MA # ( ‎𞺥‎ → ‎و‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK WAW → ARABIC LETTER WAW # +FEEE ; 0648 ; MA # ( ‎ﻮ‎ → ‎و‎ ) ARABIC LETTER WAW FINAL FORM → ARABIC LETTER WAW # +FEED ; 0648 ; MA # ( ‎ﻭ‎ → ‎و‎ ) ARABIC LETTER WAW ISOLATED FORM → ARABIC LETTER WAW # +08B1 ; 0648 ; MA # ( ‎ࢱ‎ → ‎و‎ ) ARABIC LETTER STRAIGHT WAW → ARABIC LETTER WAW # + +06CB ; 0648 06DB ; MA # ( ‎ۋ‎ → ‎وۛ‎ ) ARABIC LETTER VE → ARABIC LETTER WAW, ARABIC SMALL HIGH THREE DOTS # +FBDF ; 0648 06DB ; MA # ( ‎ﯟ‎ → ‎وۛ‎ ) ARABIC LETTER VE FINAL FORM → ARABIC LETTER WAW, ARABIC SMALL HIGH THREE DOTS # →‎ۋ‎→ +FBDE ; 0648 06DB ; MA 
# ( ‎ﯞ‎ → ‎وۛ‎ ) ARABIC LETTER VE ISOLATED FORM → ARABIC LETTER WAW, ARABIC SMALL HIGH THREE DOTS # →‎ۋ‎→ + +06C7 ; 0648 0313 ; MA # ( ‎ۇ‎ → ‎و̓‎ ) ARABIC LETTER U → ARABIC LETTER WAW, COMBINING COMMA ABOVE # →‎وُ‎→ +FBD8 ; 0648 0313 ; MA # ( ‎ﯘ‎ → ‎و̓‎ ) ARABIC LETTER U FINAL FORM → ARABIC LETTER WAW, COMBINING COMMA ABOVE # →‎ۇ‎→→‎وُ‎→ +FBD7 ; 0648 0313 ; MA # ( ‎ﯗ‎ → ‎و̓‎ ) ARABIC LETTER U ISOLATED FORM → ARABIC LETTER WAW, COMBINING COMMA ABOVE # →‎ۇ‎→→‎وُ‎→ + +06C6 ; 0648 0306 ; MA # ( ‎ۆ‎ → ‎و̆‎ ) ARABIC LETTER OE → ARABIC LETTER WAW, COMBINING BREVE # →‎وٚ‎→ +FBDA ; 0648 0306 ; MA # ( ‎ﯚ‎ → ‎و̆‎ ) ARABIC LETTER OE FINAL FORM → ARABIC LETTER WAW, COMBINING BREVE # →‎ۆ‎→→‎وٚ‎→ +FBD9 ; 0648 0306 ; MA # ( ‎ﯙ‎ → ‎و̆‎ ) ARABIC LETTER OE ISOLATED FORM → ARABIC LETTER WAW, COMBINING BREVE # →‎ۆ‎→→‎وٚ‎→ + +06C9 ; 0648 0302 ; MA # ( ‎ۉ‎ → ‎و̂‎ ) ARABIC LETTER KIRGHIZ YU → ARABIC LETTER WAW, COMBINING CIRCUMFLEX ACCENT # →‎وٛ‎→ +FBE3 ; 0648 0302 ; MA # ( ‎ﯣ‎ → ‎و̂‎ ) ARABIC LETTER KIRGHIZ YU FINAL FORM → ARABIC LETTER WAW, COMBINING CIRCUMFLEX ACCENT # →‎ۉ‎→→‎وٛ‎→ +FBE2 ; 0648 0302 ; MA # ( ‎ﯢ‎ → ‎و̂‎ ) ARABIC LETTER KIRGHIZ YU ISOLATED FORM → ARABIC LETTER WAW, COMBINING CIRCUMFLEX ACCENT # →‎ۉ‎→→‎وٛ‎→ + +06C8 ; 0648 0670 ; MA # ( ‎ۈ‎ → ‎وٰ‎ ) ARABIC LETTER YU → ARABIC LETTER WAW, ARABIC LETTER SUPERSCRIPT ALEF # +FBDC ; 0648 0670 ; MA # ( ‎ﯜ‎ → ‎وٰ‎ ) ARABIC LETTER YU FINAL FORM → ARABIC LETTER WAW, ARABIC LETTER SUPERSCRIPT ALEF # →‎ۈ‎→ +FBDB ; 0648 0670 ; MA # ( ‎ﯛ‎ → ‎وٰ‎ ) ARABIC LETTER YU ISOLATED FORM → ARABIC LETTER WAW, ARABIC LETTER SUPERSCRIPT ALEF # →‎ۈ‎→ + +0676 ; 0648 0674 ; MA # ( ‎ٶ‎ → ‎وٴ‎ ) ARABIC LETTER HIGH HAMZA WAW → ARABIC LETTER WAW, ARABIC LETTER HIGH HAMZA # +0624 ; 0648 0674 ; MA # ( ‎ؤ‎ → ‎وٴ‎ ) ARABIC LETTER WAW WITH HAMZA ABOVE → ARABIC LETTER WAW, ARABIC LETTER HIGH HAMZA # →‎ٶ‎→ +FE86 ; 0648 0674 ; MA # ( ‎ﺆ‎ → ‎وٴ‎ ) ARABIC LETTER WAW WITH HAMZA ABOVE FINAL FORM → ARABIC LETTER WAW, ARABIC LETTER HIGH HAMZA # →‎ٶ‎→ +FE85 ; 0648 0674 ; 
MA # ( ‎ﺅ‎ → ‎وٴ‎ ) ARABIC LETTER WAW WITH HAMZA ABOVE ISOLATED FORM → ARABIC LETTER WAW, ARABIC LETTER HIGH HAMZA # →‎ٶ‎→ + +0677 ; 0648 0313 0674 ; MA # ( ‎ٷ‎ → ‎و̓ٴ‎ ) ARABIC LETTER U WITH HAMZA ABOVE → ARABIC LETTER WAW, COMBINING COMMA ABOVE, ARABIC LETTER HIGH HAMZA # →‎ۇٴ‎→ +FBDD ; 0648 0313 0674 ; MA # ( ‎ﯝ‎ → ‎و̓ٴ‎ ) ARABIC LETTER U WITH HAMZA ABOVE ISOLATED FORM → ARABIC LETTER WAW, COMBINING COMMA ABOVE, ARABIC LETTER HIGH HAMZA # →‎ۇٴ‎→ + +FDF8 ; 0648 0633 0644 0645 ; MA # ( ‎ﷸ‎ → ‎وسلم‎ ) ARABIC LIGATURE WASALLAM ISOLATED FORM → ARABIC LETTER WAW, ARABIC LETTER SEEN, ARABIC LETTER LAM, ARABIC LETTER MEEM # + +FBE1 ; 06C5 ; MA # ( ‎ﯡ‎ → ‎ۅ‎ ) ARABIC LETTER KIRGHIZ OE FINAL FORM → ARABIC LETTER KIRGHIZ OE # +FBE0 ; 06C5 ; MA # ( ‎ﯠ‎ → ‎ۅ‎ ) ARABIC LETTER KIRGHIZ OE ISOLATED FORM → ARABIC LETTER KIRGHIZ OE # + +066E ; 0649 ; MA # ( ‎ٮ‎ → ‎ى‎ ) ARABIC LETTER DOTLESS BEH → ARABIC LETTER ALEF MAKSURA # +1EE1C ; 0649 ; MA # ( ‎𞸜‎ → ‎ى‎ ) ARABIC MATHEMATICAL DOTLESS BEH → ARABIC LETTER ALEF MAKSURA # →‎ٮ‎→ +1EE7C ; 0649 ; MA # ( ‎𞹼‎ → ‎ى‎ ) ARABIC MATHEMATICAL STRETCHED DOTLESS BEH → ARABIC LETTER ALEF MAKSURA # →‎ٮ‎→ +06BA ; 0649 ; MA # ( ‎ں‎ → ‎ى‎ ) ARABIC LETTER NOON GHUNNA → ARABIC LETTER ALEF MAKSURA # +1EE1D ; 0649 ; MA # ( ‎𞸝‎ → ‎ى‎ ) ARABIC MATHEMATICAL DOTLESS NOON → ARABIC LETTER ALEF MAKSURA # →‎ں‎→ +1EE5D ; 0649 ; MA # ( ‎𞹝‎ → ‎ى‎ ) ARABIC MATHEMATICAL TAILED DOTLESS NOON → ARABIC LETTER ALEF MAKSURA # →‎ں‎→ +FB9F ; 0649 ; MA # ( ‎ﮟ‎ → ‎ى‎ ) ARABIC LETTER NOON GHUNNA FINAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ں‎→ +FB9E ; 0649 ; MA # ( ‎ﮞ‎ → ‎ى‎ ) ARABIC LETTER NOON GHUNNA ISOLATED FORM → ARABIC LETTER ALEF MAKSURA # →‎ں‎→ +08BD ; 0649 ; MA # ( ‎ࢽ‎ → ‎ى‎ ) ARABIC LETTER AFRICAN NOON → ARABIC LETTER ALEF MAKSURA # →‎ں‎→ +FBE8 ; 0649 ; MA # ( ‎ﯨ‎ → ‎ى‎ ) ARABIC LETTER UIGHUR KAZAKH KIRGHIZ ALEF MAKSURA INITIAL FORM → ARABIC LETTER ALEF MAKSURA # +FBE9 ; 0649 ; MA # ( ‎ﯩ‎ → ‎ى‎ ) ARABIC LETTER UIGHUR KAZAKH KIRGHIZ ALEF MAKSURA MEDIAL FORM → ARABIC 
LETTER ALEF MAKSURA # +FEF0 ; 0649 ; MA # ( ‎ﻰ‎ → ‎ى‎ ) ARABIC LETTER ALEF MAKSURA FINAL FORM → ARABIC LETTER ALEF MAKSURA # +FEEF ; 0649 ; MA # ( ‎ﻯ‎ → ‎ى‎ ) ARABIC LETTER ALEF MAKSURA ISOLATED FORM → ARABIC LETTER ALEF MAKSURA # +064A ; 0649 ; MA # ( ‎ي‎ → ‎ى‎ ) ARABIC LETTER YEH → ARABIC LETTER ALEF MAKSURA # +1EE09 ; 0649 ; MA # ( ‎𞸉‎ → ‎ى‎ ) ARABIC MATHEMATICAL YEH → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +1EE29 ; 0649 ; MA # ( ‎𞸩‎ → ‎ى‎ ) ARABIC MATHEMATICAL INITIAL YEH → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +1EE49 ; 0649 ; MA # ( ‎𞹉‎ → ‎ى‎ ) ARABIC MATHEMATICAL TAILED YEH → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +1EE69 ; 0649 ; MA # ( ‎𞹩‎ → ‎ى‎ ) ARABIC MATHEMATICAL STRETCHED YEH → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +1EE89 ; 0649 ; MA # ( ‎𞺉‎ → ‎ى‎ ) ARABIC MATHEMATICAL LOOPED YEH → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +1EEA9 ; 0649 ; MA # ( ‎𞺩‎ → ‎ى‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK YEH → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +FEF3 ; 0649 ; MA # ( ‎ﻳ‎ → ‎ى‎ ) ARABIC LETTER YEH INITIAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +FEF4 ; 0649 ; MA # ( ‎ﻴ‎ → ‎ى‎ ) ARABIC LETTER YEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +FEF2 ; 0649 ; MA # ( ‎ﻲ‎ → ‎ى‎ ) ARABIC LETTER YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +FEF1 ; 0649 ; MA # ( ‎ﻱ‎ → ‎ى‎ ) ARABIC LETTER YEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +06CC ; 0649 ; MA # ( ‎ی‎ → ‎ى‎ ) ARABIC LETTER FARSI YEH → ARABIC LETTER ALEF MAKSURA # +FBFE ; 0649 ; MA # ( ‎ﯾ‎ → ‎ى‎ ) ARABIC LETTER FARSI YEH INITIAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ی‎→ +FBFF ; 0649 ; MA # ( ‎ﯿ‎ → ‎ى‎ ) ARABIC LETTER FARSI YEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ی‎→ +FBFD ; 0649 ; MA # ( ‎ﯽ‎ → ‎ى‎ ) ARABIC LETTER FARSI YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ﻰ‎→ +FBFC ; 0649 ; MA # ( ‎ﯼ‎ → ‎ى‎ ) ARABIC LETTER FARSI YEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA # +06D2 ; 0649 ; MA # ( ‎ے‎ → ‎ى‎ ) ARABIC LETTER YEH BARREE → ARABIC LETTER ALEF MAKSURA # →‎ي‎→ +FBAF ; 0649 ; MA # ( ‎ﮯ‎ → ‎ى‎ ) ARABIC LETTER 
YEH BARREE FINAL FORM → ARABIC LETTER ALEF MAKSURA # →‎ے‎→→‎ي‎→ +FBAE ; 0649 ; MA # ( ‎ﮮ‎ → ‎ى‎ ) ARABIC LETTER YEH BARREE ISOLATED FORM → ARABIC LETTER ALEF MAKSURA # →‎ے‎→→‎ي‎→ + +0679 ; 0649 0615 ; MA # ( ‎ٹ‎ → ‎ىؕ‎ ) ARABIC LETTER TTEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ٮؕ‎→ +FB68 ; 0649 0615 ; MA # ( ‎ﭨ‎ → ‎ىؕ‎ ) ARABIC LETTER TTEH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ٹ‎→→‎ٮؕ‎→ +FB69 ; 0649 0615 ; MA # ( ‎ﭩ‎ → ‎ىؕ‎ ) ARABIC LETTER TTEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ٹ‎→→‎ٮؕ‎→ +FB67 ; 0649 0615 ; MA # ( ‎ﭧ‎ → ‎ىؕ‎ ) ARABIC LETTER TTEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ٹ‎→→‎ٮؕ‎→ +FB66 ; 0649 0615 ; MA # ( ‎ﭦ‎ → ‎ىؕ‎ ) ARABIC LETTER TTEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ٹ‎→→‎ٮؕ‎→ +06BB ; 0649 0615 ; MA # ( ‎ڻ‎ → ‎ىؕ‎ ) ARABIC LETTER RNOON → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ںؕ‎→ +FBA2 ; 0649 0615 ; MA # ( ‎ﮢ‎ → ‎ىؕ‎ ) ARABIC LETTER RNOON INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ڻ‎→→‎ںؕ‎→ +FBA3 ; 0649 0615 ; MA # ( ‎ﮣ‎ → ‎ىؕ‎ ) ARABIC LETTER RNOON MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ڻ‎→→‎ںؕ‎→ +FBA1 ; 0649 0615 ; MA # ( ‎ﮡ‎ → ‎ىؕ‎ ) ARABIC LETTER RNOON FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ڻ‎→→‎ںؕ‎→ +FBA0 ; 0649 0615 ; MA # ( ‎ﮠ‎ → ‎ىؕ‎ ) ARABIC LETTER RNOON ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH # →‎ڻ‎→→‎ںؕ‎→ + +067E ; 0649 06DB ; MA # ( ‎پ‎ → ‎ىۛ‎ ) ARABIC LETTER PEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ڽ‎→→‎ںۛ‎→ +FB58 ; 0649 06DB ; MA # ( ‎ﭘ‎ → ‎ىۛ‎ ) ARABIC LETTER PEH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎پ‎→→‎ڽ‎→→‎ںۛ‎→ +FB59 ; 0649 06DB ; MA # ( ‎ﭙ‎ → ‎ىۛ‎ ) ARABIC LETTER PEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎پ‎→→‎ڽ‎→→‎ںۛ‎→ +FB57 ; 0649 06DB ; MA # ( ‎ﭗ‎ → ‎ىۛ‎ ) 
ARABIC LETTER PEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎پ‎→→‎ڽ‎→→‎ںۛ‎→ +FB56 ; 0649 06DB ; MA # ( ‎ﭖ‎ → ‎ىۛ‎ ) ARABIC LETTER PEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎پ‎→→‎ڽ‎→→‎ںۛ‎→ +0752 ; 0649 06DB ; MA # ( ‎ݒ‎ → ‎ىۛ‎ ) ARABIC LETTER BEH WITH THREE DOTS POINTING UPWARDS BELOW → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎پ‎→→‎ڽ‎→→‎ںۛ‎→ +062B ; 0649 06DB ; MA # ( ‎ث‎ → ‎ىۛ‎ ) ARABIC LETTER THEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ٮۛ‎→ +1EE16 ; 0649 06DB ; MA # ( ‎𞸖‎ → ‎ىۛ‎ ) ARABIC MATHEMATICAL THEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +1EE36 ; 0649 06DB ; MA # ( ‎𞸶‎ → ‎ىۛ‎ ) ARABIC MATHEMATICAL INITIAL THEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +1EE76 ; 0649 06DB ; MA # ( ‎𞹶‎ → ‎ىۛ‎ ) ARABIC MATHEMATICAL STRETCHED THEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +1EE96 ; 0649 06DB ; MA # ( ‎𞺖‎ → ‎ىۛ‎ ) ARABIC MATHEMATICAL LOOPED THEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +1EEB6 ; 0649 06DB ; MA # ( ‎𞺶‎ → ‎ىۛ‎ ) ARABIC MATHEMATICAL DOUBLE-STRUCK THEH → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +FE9B ; 0649 06DB ; MA # ( ‎ﺛ‎ → ‎ىۛ‎ ) ARABIC LETTER THEH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +FE9C ; 0649 06DB ; MA # ( ‎ﺜ‎ → ‎ىۛ‎ ) ARABIC LETTER THEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +FE9A ; 0649 06DB ; MA # ( ‎ﺚ‎ → ‎ىۛ‎ ) ARABIC LETTER THEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +FE99 ; 0649 06DB ; MA # ( ‎ﺙ‎ → ‎ىۛ‎ ) ARABIC LETTER THEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎ث‎→→‎ٮۛ‎→ +06BD ; 0649 06DB ; MA # ( ‎ڽ‎ → ‎ىۛ‎ ) ARABIC LETTER NOON WITH THREE DOTS ABOVE → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH 
THREE DOTS # →‎ںۛ‎→ +06D1 ; 0649 06DB ; MA # ( ‎ۑ‎ → ‎ىۛ‎ ) ARABIC LETTER YEH WITH THREE DOTS BELOW → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎پ‎→→‎ڽ‎→→‎ںۛ‎→ +063F ; 0649 06DB ; MA # ( ‎ؿ‎ → ‎ىۛ‎ ) ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS # →‎یۛ‎→ + +08B7 ; 0649 06DB 06E2 ; MA # ( ‎ࢷ‎ → ‎ىۛۢ‎ ) ARABIC LETTER PEH WITH SMALL MEEM ABOVE → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC SMALL HIGH MEEM ISOLATED FORM # →‎پۢ‎→ + +0756 ; 0649 0306 ; MA # ( ‎ݖ‎ → ‎ى̆‎ ) ARABIC LETTER BEH WITH SMALL V → ARABIC LETTER ALEF MAKSURA, COMBINING BREVE # →‎ٮٚ‎→ +06CE ; 0649 0306 ; MA # ( ‎ێ‎ → ‎ى̆‎ ) ARABIC LETTER YEH WITH SMALL V → ARABIC LETTER ALEF MAKSURA, COMBINING BREVE # →‎یٚ‎→ + +08C0 ; 0649 0615 0306 ; MA # ( ‎ࣀ‎ → ‎ىؕ̆‎ ) ARABIC LETTER TTEH WITH SMALL V → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH TAH, COMBINING BREVE # →‎ٹٚ‎→ + +08BE ; 0649 06DB 0306 ; MA # ( ‎ࢾ‎ → ‎ىۛ̆‎ ) ARABIC LETTER PEH WITH SMALL V → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, COMBINING BREVE # →‎پٚ‎→ + +08BA ; 0649 0306 0307 ; MA # ( ‎ࢺ‎ → ‎ى̆̇‎ ) ARABIC LETTER YEH WITH TWO DOTS BELOW AND SMALL NOON ABOVE → ARABIC LETTER ALEF MAKSURA, COMBINING BREVE, COMBINING DOT ABOVE # →‎يۨ‎→ + +063D ; 0649 0302 ; MA # ( ‎ؽ‎ → ‎ى̂‎ ) ARABIC LETTER FARSI YEH WITH INVERTED V → ARABIC LETTER ALEF MAKSURA, COMBINING CIRCUMFLEX ACCENT # →‎یٛ‎→ + +088F ; 0649 030A ; MA # ( ‎࢏‎ → ‎ى̊‎ ) ARABIC LETTER NOON WITH RING ABOVE → ARABIC LETTER ALEF MAKSURA, COMBINING RING ABOVE # →‎ں̊‎→ + +08A8 ; 0649 0654 ; MA # ( ‎ࢨ‎ → ‎ىٔ‎ ) ARABIC LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE → ARABIC LETTER ALEF MAKSURA, ARABIC HAMZA ABOVE # →‎ئ‎→ + +FC90 ; 0649 0670 ; MA # ( ‎ﲐ‎ → ‎ىٰ‎ ) ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT ALEF FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER SUPERSCRIPT ALEF # +FC5D ; 0649 0670 ; MA # ( ‎ﱝ‎ → ‎ىٰ‎ ) ARABIC LIGATURE ALEF MAKSURA WITH SUPERSCRIPT ALEF ISOLATED FORM → 
ARABIC LETTER ALEF MAKSURA, ARABIC LETTER SUPERSCRIPT ALEF # + +FCDE ; 0649 006F ; MA # ( ‎ﳞ‎ → ‎ىo‎ ) ARABIC LIGATURE YEH WITH HEH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, LATIN SMALL LETTER O # →‎يه‎→ +FCF1 ; 0649 006F ; MA # ( ‎ﳱ‎ → ‎ىo‎ ) ARABIC LIGATURE YEH WITH HEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, LATIN SMALL LETTER O # →‎يه‎→ + +FCE6 ; 0649 06DB 006F ; MA # ( ‎ﳦ‎ → ‎ىۛo‎ ) ARABIC LIGATURE THEH WITH HEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, LATIN SMALL LETTER O # →‎ثه‎→ + +0678 ; 0649 0674 ; MA # ( ‎ٸ‎ → ‎ىٴ‎ ) ARABIC LETTER HIGH HAMZA YEH → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA # →‎يٴ‎→ +0626 ; 0649 0674 ; MA # ( ‎ئ‎ → ‎ىٴ‎ ) ARABIC LETTER YEH WITH HAMZA ABOVE → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA # →‎ٸ‎→→‎يٴ‎→ +FE8B ; 0649 0674 ; MA # ( ‎ﺋ‎ → ‎ىٴ‎ ) ARABIC LETTER YEH WITH HAMZA ABOVE INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA # →‎ئ‎→→‎ٸ‎→→‎يٴ‎→ +FE8C ; 0649 0674 ; MA # ( ‎ﺌ‎ → ‎ىٴ‎ ) ARABIC LETTER YEH WITH HAMZA ABOVE MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA # →‎ئ‎→→‎ٸ‎→→‎يٴ‎→ +FE8A ; 0649 0674 ; MA # ( ‎ﺊ‎ → ‎ىٴ‎ ) ARABIC LETTER YEH WITH HAMZA ABOVE FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA # →‎ئ‎→→‎ٸ‎→→‎يٴ‎→ +FE89 ; 0649 0674 ; MA # ( ‎ﺉ‎ → ‎ىٴ‎ ) ARABIC LETTER YEH WITH HAMZA ABOVE ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA # →‎ٸ‎→→‎يٴ‎→ + +FBEB ; 0649 0674 006C ; MA # ( ‎ﯫ‎ → ‎ىٴl‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH ALEF FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, LATIN SMALL LETTER L # →‎ئا‎→ +FBEA ; 0649 0674 006C ; MA # ( ‎ﯪ‎ → ‎ىٴl‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH ALEF ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, LATIN SMALL LETTER L # →‎ئا‎→ + +FC9B ; 0649 0674 006F ; MA # ( ‎ﲛ‎ → ‎ىٴo‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH HEH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, LATIN 
SMALL LETTER O # →‎ئه‎→ +FCE0 ; 0649 0674 006F ; MA # ( ‎ﳠ‎ → ‎ىٴo‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH HEH MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, LATIN SMALL LETTER O # →‎ئه‎→ +FBED ; 0649 0674 006F ; MA # ( ‎ﯭ‎ → ‎ىٴo‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH AE FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, LATIN SMALL LETTER O # →‎ئە‎→→‎ٴىo‎→→‎ئه‎→ +FBEC ; 0649 0674 006F ; MA # ( ‎ﯬ‎ → ‎ىٴo‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH AE ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, LATIN SMALL LETTER O # →‎ئە‎→→‎ٴىo‎→→‎ئه‎→ + +FBF8 ; 0649 0674 067B ; MA # ( ‎ﯸ‎ → ‎ىٴٻ‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH E INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER BEEH # →‎ئې‎→ +FBF7 ; 0649 0674 067B ; MA # ( ‎ﯷ‎ → ‎ىٴٻ‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH E FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER BEEH # →‎ئې‎→ +FBF6 ; 0649 0674 067B ; MA # ( ‎ﯶ‎ → ‎ىٴٻ‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH E ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER BEEH # →‎ئې‎→ + +FC97 ; 0649 0674 062C ; MA # ( ‎ﲗ‎ → ‎ىٴج‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH JEEM INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER JEEM # →‎ئج‎→ +FC00 ; 0649 0674 062C ; MA # ( ‎ﰀ‎ → ‎ىٴج‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH JEEM ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER JEEM # →‎ئج‎→ + +FC98 ; 0649 0674 062D ; MA # ( ‎ﲘ‎ → ‎ىٴح‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH HAH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER HAH # →‎ئح‎→ +FC01 ; 0649 0674 062D ; MA # ( ‎ﰁ‎ → ‎ىٴح‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH HAH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER HAH # →‎ئح‎→ + +FC99 ; 0649 0674 062E ; MA # ( ‎ﲙ‎ → ‎ىٴخ‎ ) ARABIC 
LIGATURE YEH WITH HAMZA ABOVE WITH KHAH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER KHAH # →‎ئخ‎→ + +FC64 ; 0649 0674 0631 ; MA # ( ‎ﱤ‎ → ‎ىٴر‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH REH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER REH # →‎ئر‎→ + +FC65 ; 0649 0674 0632 ; MA # ( ‎ﱥ‎ → ‎ىٴز‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH ZAIN FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ZAIN # →‎ئز‎→ + +FC9A ; 0649 0674 0645 ; MA # ( ‎ﲚ‎ → ‎ىٴم‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH MEEM INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER MEEM # →‎ئم‎→ +FCDF ; 0649 0674 0645 ; MA # ( ‎ﳟ‎ → ‎ىٴم‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH MEEM MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER MEEM # →‎ئم‎→ +FC66 ; 0649 0674 0645 ; MA # ( ‎ﱦ‎ → ‎ىٴم‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH MEEM FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER MEEM # →‎ئم‎→ +FC02 ; 0649 0674 0645 ; MA # ( ‎ﰂ‎ → ‎ىٴم‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH MEEM ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER MEEM # →‎ئم‎→ + +FC67 ; 0649 0674 0646 ; MA # ( ‎ﱧ‎ → ‎ىٴن‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH NOON FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER NOON # →‎ئن‎→ + +FBEF ; 0649 0674 0648 ; MA # ( ‎ﯯ‎ → ‎ىٴو‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH WAW FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER WAW # →‎ئو‎→ +FBEE ; 0649 0674 0648 ; MA # ( ‎ﯮ‎ → ‎ىٴو‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH WAW ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER WAW # →‎ئو‎→ + +FBF1 ; 0649 0674 0648 0313 ; MA # ( ‎ﯱ‎ → ‎ىٴو̓‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH U FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH 
HAMZA, ARABIC LETTER WAW, COMBINING COMMA ABOVE # →‎ئۇ‎→ +FBF0 ; 0649 0674 0648 0313 ; MA # ( ‎ﯰ‎ → ‎ىٴو̓‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH U ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER WAW, COMBINING COMMA ABOVE # →‎ئۇ‎→ + +FBF3 ; 0649 0674 0648 0306 ; MA # ( ‎ﯳ‎ → ‎ىٴو̆‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH OE FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER WAW, COMBINING BREVE # →‎ئۆ‎→ +FBF2 ; 0649 0674 0648 0306 ; MA # ( ‎ﯲ‎ → ‎ىٴو̆‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH OE ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER WAW, COMBINING BREVE # →‎ئۆ‎→ + +FBF5 ; 0649 0674 0648 0670 ; MA # ( ‎ﯵ‎ → ‎ىٴوٰ‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH YU FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER WAW, ARABIC LETTER SUPERSCRIPT ALEF # →‎ئۈ‎→ +FBF4 ; 0649 0674 0648 0670 ; MA # ( ‎ﯴ‎ → ‎ىٴوٰ‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH YU ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER WAW, ARABIC LETTER SUPERSCRIPT ALEF # →‎ئۈ‎→ + +FBFB ; 0649 0674 0649 ; MA # ( ‎ﯻ‎ → ‎ىٴى‎ ) ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ALEF MAKSURA # →‎ئى‎→ +FBFA ; 0649 0674 0649 ; MA # ( ‎ﯺ‎ → ‎ىٴى‎ ) ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ALEF MAKSURA # →‎ئى‎→ +FC68 ; 0649 0674 0649 ; MA # ( ‎ﱨ‎ → ‎ىٴى‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ALEF MAKSURA # →‎ئى‎→ +FBF9 ; 0649 0674 0649 ; MA # ( ‎ﯹ‎ → ‎ىٴى‎ ) ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ALEF 
MAKSURA # →‎ئى‎→ +FC03 ; 0649 0674 0649 ; MA # ( ‎ﰃ‎ → ‎ىٴى‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ALEF MAKSURA # →‎ئى‎→ +FC69 ; 0649 0674 0649 ; MA # ( ‎ﱩ‎ → ‎ىٴى‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ALEF MAKSURA # →‎ئي‎→→‎ٴىى‎→→‎ئى‎→ +FC04 ; 0649 0674 0649 ; MA # ( ‎ﰄ‎ → ‎ىٴى‎ ) ARABIC LIGATURE YEH WITH HAMZA ABOVE WITH YEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HIGH HAMZA, ARABIC LETTER ALEF MAKSURA # →‎ئي‎→→‎ٴىى‎→→‎ئى‎→ + +FCDA ; 0649 062C ; MA # ( ‎ﳚ‎ → ‎ىج‎ ) ARABIC LIGATURE YEH WITH JEEM INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER JEEM # →‎يج‎→ +FC55 ; 0649 062C ; MA # ( ‎ﱕ‎ → ‎ىج‎ ) ARABIC LIGATURE YEH WITH JEEM ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER JEEM # →‎يج‎→ + +FC11 ; 0649 06DB 062C ; MA # ( ‎ﰑ‎ → ‎ىۛج‎ ) ARABIC LIGATURE THEH WITH JEEM ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER JEEM # →‎ثج‎→ + +FDAF ; 0649 062C 0649 ; MA # ( ‎ﶯ‎ → ‎ىجى‎ ) ARABIC LIGATURE YEH WITH JEEM WITH YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER JEEM, ARABIC LETTER ALEF MAKSURA # →‎يجي‎→ + +FCDB ; 0649 062D ; MA # ( ‎ﳛ‎ → ‎ىح‎ ) ARABIC LIGATURE YEH WITH HAH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HAH # →‎يح‎→ +FC56 ; 0649 062D ; MA # ( ‎ﱖ‎ → ‎ىح‎ ) ARABIC LIGATURE YEH WITH HAH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HAH # →‎يح‎→ + +FDAE ; 0649 062D 0649 ; MA # ( ‎ﶮ‎ → ‎ىحى‎ ) ARABIC LIGATURE YEH WITH HAH WITH YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER HAH, ARABIC LETTER ALEF MAKSURA # →‎يحي‎→ + +FCDC ; 0649 062E ; MA # ( ‎ﳜ‎ → ‎ىخ‎ ) ARABIC LIGATURE YEH WITH KHAH INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER KHAH # →‎يخ‎→ +FC57 ; 0649 062E ; MA # ( ‎ﱗ‎ → ‎ىخ‎ ) ARABIC LIGATURE YEH WITH KHAH ISOLATED FORM → ARABIC 
LETTER ALEF MAKSURA, ARABIC LETTER KHAH # →‎يخ‎→ + +FC91 ; 0649 0631 ; MA # ( ‎ﲑ‎ → ‎ىر‎ ) ARABIC LIGATURE YEH WITH REH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER REH # →‎ير‎→ + +FC76 ; 0649 06DB 0631 ; MA # ( ‎ﱶ‎ → ‎ىۛر‎ ) ARABIC LIGATURE THEH WITH REH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER REH # →‎ثر‎→ + +FC92 ; 0649 0632 ; MA # ( ‎ﲒ‎ → ‎ىز‎ ) ARABIC LIGATURE YEH WITH ZAIN FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER ZAIN # →‎يز‎→ + +FC77 ; 0649 06DB 0632 ; MA # ( ‎ﱷ‎ → ‎ىۛز‎ ) ARABIC LIGATURE THEH WITH ZAIN FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ZAIN # →‎ثز‎→ + +FCDD ; 0649 0645 ; MA # ( ‎ﳝ‎ → ‎ىم‎ ) ARABIC LIGATURE YEH WITH MEEM INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER MEEM # →‎يم‎→ +FCF0 ; 0649 0645 ; MA # ( ‎ﳰ‎ → ‎ىم‎ ) ARABIC LIGATURE YEH WITH MEEM MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER MEEM # →‎يم‎→ +FC93 ; 0649 0645 ; MA # ( ‎ﲓ‎ → ‎ىم‎ ) ARABIC LIGATURE YEH WITH MEEM FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER MEEM # →‎يم‎→ +FC58 ; 0649 0645 ; MA # ( ‎ﱘ‎ → ‎ىم‎ ) ARABIC LIGATURE YEH WITH MEEM ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER MEEM # →‎يم‎→ + +FCA6 ; 0649 06DB 0645 ; MA # ( ‎ﲦ‎ → ‎ىۛم‎ ) ARABIC LIGATURE THEH WITH MEEM INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎ثم‎→ +FCE5 ; 0649 06DB 0645 ; MA # ( ‎ﳥ‎ → ‎ىۛم‎ ) ARABIC LIGATURE THEH WITH MEEM MEDIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎ثم‎→ +FC78 ; 0649 06DB 0645 ; MA # ( ‎ﱸ‎ → ‎ىۛم‎ ) ARABIC LIGATURE THEH WITH MEEM FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎ثم‎→ +FC12 ; 0649 06DB 0645 ; MA # ( ‎ﰒ‎ → ‎ىۛم‎ ) ARABIC LIGATURE THEH WITH MEEM ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER MEEM # →‎ثم‎→ + +FD9D ; 0649 0645 0645 ; MA # ( ‎ﶝ‎ → 
‎ىمم‎ ) ARABIC LIGATURE YEH WITH MEEM WITH MEEM INITIAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER MEEM, ARABIC LETTER MEEM # →‎يمم‎→ +FD9C ; 0649 0645 0645 ; MA # ( ‎ﶜ‎ → ‎ىمم‎ ) ARABIC LIGATURE YEH WITH MEEM WITH MEEM FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER MEEM, ARABIC LETTER MEEM # →‎يمم‎→ + +FDB0 ; 0649 0645 0649 ; MA # ( ‎ﶰ‎ → ‎ىمى‎ ) ARABIC LIGATURE YEH WITH MEEM WITH YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER MEEM, ARABIC LETTER ALEF MAKSURA # →‎يمي‎→ + +FC94 ; 0649 0646 ; MA # ( ‎ﲔ‎ → ‎ىن‎ ) ARABIC LIGATURE YEH WITH NOON FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER NOON # →‎ين‎→ + +FC79 ; 0649 06DB 0646 ; MA # ( ‎ﱹ‎ → ‎ىۛن‎ ) ARABIC LIGATURE THEH WITH NOON FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER NOON # →‎ثن‎→ + +FC95 ; 0649 0649 ; MA # ( ‎ﲕ‎ → ‎ىى‎ ) ARABIC LIGATURE YEH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER ALEF MAKSURA # →‎يى‎→ +FC59 ; 0649 0649 ; MA # ( ‎ﱙ‎ → ‎ىى‎ ) ARABIC LIGATURE YEH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER ALEF MAKSURA # →‎يى‎→ +FC96 ; 0649 0649 ; MA # ( ‎ﲖ‎ → ‎ىى‎ ) ARABIC LIGATURE YEH WITH YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER ALEF MAKSURA # →‎يي‎→ +FC5A ; 0649 0649 ; MA # ( ‎ﱚ‎ → ‎ىى‎ ) ARABIC LIGATURE YEH WITH YEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC LETTER ALEF MAKSURA # →‎يي‎→ + +FC7A ; 0649 06DB 0649 ; MA # ( ‎ﱺ‎ → ‎ىۛى‎ ) ARABIC LIGATURE THEH WITH ALEF MAKSURA FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # →‎ثى‎→ +FC13 ; 0649 06DB 0649 ; MA # ( ‎ﰓ‎ → ‎ىۛى‎ ) ARABIC LIGATURE THEH WITH ALEF MAKSURA ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # →‎ثى‎→ +FC7B ; 0649 06DB 0649 ; MA # ( ‎ﱻ‎ → ‎ىۛى‎ ) ARABIC LIGATURE THEH WITH YEH FINAL FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # 
→‎ثي‎→ +FC14 ; 0649 06DB 0649 ; MA # ( ‎ﰔ‎ → ‎ىۛى‎ ) ARABIC LIGATURE THEH WITH YEH ISOLATED FORM → ARABIC LETTER ALEF MAKSURA, ARABIC SMALL HIGH THREE DOTS, ARABIC LETTER ALEF MAKSURA # →‎ثي‎→ + +FBB1 ; 06D3 ; MA # ( ‎ﮱ‎ → ‎ۓ‎ ) ARABIC LETTER YEH BARREE WITH HAMZA ABOVE FINAL FORM → ARABIC LETTER YEH BARREE WITH HAMZA ABOVE # +FBB0 ; 06D3 ; MA # ( ‎ﮰ‎ → ‎ۓ‎ ) ARABIC LETTER YEH BARREE WITH HAMZA ABOVE ISOLATED FORM → ARABIC LETTER YEH BARREE WITH HAMZA ABOVE # + +102B8 ; 2D40 ; MA # ( 𐊸 → ⵀ ) CARIAN LETTER SS → TIFINAGH LETTER YAH # + +205E ; 2D42 ; MA #* ( ⁞ → ⵂ ) VERTICAL FOUR DOTS → TIFINAGH LETTER TUAREG YAH # +2E3D ; 2D42 ; MA #* ( ⸽ → ⵂ ) VERTICAL SIX DOTS → TIFINAGH LETTER TUAREG YAH # →⁞→ +2999 ; 2D42 ; MA #* ( ⦙ → ⵂ ) DOTTED FENCE → TIFINAGH LETTER TUAREG YAH # →⁞→ +1CEEF ; 2D42 ; MA #* ( 𜻯 → ⵂ ) GEOMANTIC FIGURE VIA → TIFINAGH LETTER TUAREG YAH # →⁞→ + +FE19 ; 2D57 ; MA #* ( ︙ → ⵗ ) PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS → TIFINAGH LETTER TUAREG YAGH # →⁝→ +205D ; 2D57 ; MA #* ( ⁝ → ⵗ ) TRICOLON → TIFINAGH LETTER TUAREG YAGH # +22EE ; 2D57 ; MA #* ( ⋮ → ⵗ ) VERTICAL ELLIPSIS → TIFINAGH LETTER TUAREG YAGH # →︙→→⁝→ + +0544 ; 1206 ; MA # ( Մ → ሆ ) ARMENIAN CAPITAL LETTER MEN → ETHIOPIC SYLLABLE HO # + +054C ; 1261 ; MA # ( Ռ → ቡ ) ARMENIAN CAPITAL LETTER RA → ETHIOPIC SYLLABLE BU # + +053B ; 12AE ; MA # ( Ի → ኮ ) ARMENIAN CAPITAL LETTER INI → ETHIOPIC SYLLABLE KO # + +054A ; 1323 ; MA # ( Պ → ጣ ) ARMENIAN CAPITAL LETTER PEH → ETHIOPIC SYLLABLE THAA # + +0972 ; 0905 0306 ; MA # ( ॲ → अ̆ ) DEVANAGARI LETTER CANDRA A → DEVANAGARI LETTER A, COMBINING BREVE # →अॅ→ + +0906 ; 0905 093E ; MA # ( आ → अा ) DEVANAGARI LETTER AA → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA # + +0911 ; 0905 093E 0306 ; MA # ( ऑ → अा̆ ) DEVANAGARI LETTER CANDRA O → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, COMBINING BREVE # →अॉ→ + +0974 ; 0905 093E 093A ; MA # ( ॴ → अाऺ ) DEVANAGARI LETTER OOE → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE # 
→अऻ→ + +0912 ; 0905 093E 0946 ; MA # ( ऒ → अाॆ ) DEVANAGARI LETTER SHORT O → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN SHORT E # →अॊ→→आॆ→ + +0914 ; 0905 093E 0948 ; MA # ( औ → अाै ) DEVANAGARI LETTER AU → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN AI # →अौ→→आै→ + +0913 ; 0905 093E 11B64 ; MA # ( ओ → अा𑭤 ) DEVANAGARI LETTER O → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AA, SHARADA VOWEL SIGN SHORT E # →अो→→आे→ + +0973 ; 0905 093A ; MA # ( ॳ → अऺ ) DEVANAGARI LETTER OE → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN OE # + +0975 ; 0905 094F ; MA # ( ॵ → अॏ ) DEVANAGARI LETTER AW → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN AW # + +0904 ; 0905 0946 ; MA # ( ऄ → अॆ ) DEVANAGARI LETTER SHORT A → DEVANAGARI LETTER A, DEVANAGARI VOWEL SIGN SHORT E # + +0A24 ; 0909 ; MA # ( ਤ → उ ) GURMUKHI LETTER TA → DEVANAGARI LETTER U # + +090D ; 090F 0306 ; MA # ( ऍ → ए̆ ) DEVANAGARI LETTER CANDRA E → DEVANAGARI LETTER E, COMBINING BREVE # →एॅ→ + +090E ; 090F 0946 ; MA # ( ऎ → एॆ ) DEVANAGARI LETTER SHORT E → DEVANAGARI LETTER E, DEVANAGARI VOWEL SIGN SHORT E # + +0910 ; 090F 11B64 ; MA # ( ऐ → ए𑭤 ) DEVANAGARI LETTER AI → DEVANAGARI LETTER E, SHARADA VOWEL SIGN SHORT E # →एे→ + +0A1F ; 091F ; MA # ( ਟ → ट ) GURMUKHI LETTER TTA → DEVANAGARI LETTER TTA # + +0A20 ; 0920 ; MA # ( ਠ → ठ ) GURMUKHI LETTER TTHA → DEVANAGARI LETTER TTHA # + +0A2B ; 0922 ; MA # ( ਫ → ढ ) GURMUKHI LETTER PHA → DEVANAGARI LETTER DDHA # + +0A1C ; 0924 094D 0924 ; MA # ( ਜ → त्त ) GURMUKHI LETTER JA → DEVANAGARI LETTER TA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER TA # + +0A27 ; 092A ; MA # ( ਧ → प ) GURMUKHI LETTER DHA → DEVANAGARI LETTER PA # + +0A72 ; 092A 094D 091F ; MA # ( ੲ → प्ट ) GURMUKHI IRI → DEVANAGARI LETTER PA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER TTA # + +0A07 ; 092A 094D 091F 09BF ; MA # ( ਇ → प्टি ) GURMUKHI LETTER I → DEVANAGARI LETTER PA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER TTA, BENGALI VOWEL SIGN I # →ੲਿ→ + +0A08 ; 092A 094D 091F 0A40 
; MA # ( ਈ → प्टੀ ) GURMUKHI LETTER II → DEVANAGARI LETTER PA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER TTA, GURMUKHI VOWEL SIGN II # →ੲੀ→ + +0A0F ; 092A 094D 091F 11B64 ; MA # ( ਏ → प्ट𑭤 ) GURMUKHI LETTER EE → DEVANAGARI LETTER PA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER TTA, SHARADA VOWEL SIGN SHORT E # →ੲੇ→ + +0A2E ; 092D ; MA # ( ਮ → भ ) GURMUKHI LETTER MA → DEVANAGARI LETTER BHA # + +0A38 ; 092E ; MA # ( ਸ → म ) GURMUKHI LETTER SA → DEVANAGARI LETTER MA # + +0908 ; 0930 094D 0907 ; MA # ( ई → र्इ ) DEVANAGARI LETTER II → DEVANAGARI LETTER RA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER I # + +0A15 ; 0935 ; MA # ( ਕ → व ) GURMUKHI LETTER KA → DEVANAGARI LETTER VA # + +0A35 ; 0939 ; MA # ( ਵ → ह ) GURMUKHI LETTER VA → DEVANAGARI LETTER HA # + +0ABD ; 093D ; MA # ( ઽ → ऽ ) GUJARATI SIGN AVAGRAHA → DEVANAGARI SIGN AVAGRAHA # + +111DC ; A8FB ; MA # ( 𑇜 → ꣻ ) SHARADA HEADSTROKE → DEVANAGARI HEADSTROKE # + +0949 ; 093E 0306 ; MA # ( ॉ → ा̆ ) DEVANAGARI VOWEL SIGN CANDRA O → DEVANAGARI VOWEL SIGN AA, COMBINING BREVE # →ाॅ→ + +093B ; 093E 093A ; MA # ( ऻ → ाऺ ) DEVANAGARI VOWEL SIGN OOE → DEVANAGARI VOWEL SIGN AA, DEVANAGARI VOWEL SIGN OE # + +111CB ; 093A ; MA # ( 𑇋 → ऺ ) SHARADA VOWEL MODIFIER MARK → DEVANAGARI VOWEL SIGN OE # +11B60 ; 093A ; MA # ( 𑭠 → ऺ ) SHARADA VOWEL SIGN OE → DEVANAGARI VOWEL SIGN OE # + +0AC1 ; 0941 ; MA # ( ુ → ु ) GUJARATI VOWEL SIGN U → DEVANAGARI VOWEL SIGN U # + +0AC2 ; 0942 ; MA # ( ૂ → ू ) GUJARATI VOWEL SIGN UU → DEVANAGARI VOWEL SIGN UU # + +0A4B ; 0946 ; MA # ( ੋ → ॆ ) GURMUKHI VOWEL SIGN OO → DEVANAGARI VOWEL SIGN SHORT E # + +0A48 ; 0948 ; MA # ( ੈ → ै ) GURMUKHI VOWEL SIGN AI → DEVANAGARI VOWEL SIGN AI # + +0A4D ; 094D ; MA # ( ੍ → ् ) GURMUKHI SIGN VIRAMA → DEVANAGARI SIGN VIRAMA # +0ACD ; 094D ; MA # ( ્ → ् ) GUJARATI SIGN VIRAMA → DEVANAGARI SIGN VIRAMA # + +0986 ; 0985 09BE ; MA # ( আ → অা ) BENGALI LETTER AA → BENGALI LETTER A, BENGALI VOWEL SIGN AA # + +09E0 ; 098B 09C3 ; MA # ( ৠ → ঋৃ ) BENGALI LETTER VOCALIC RR → BENGALI 
LETTER VOCALIC R, BENGALI VOWEL SIGN VOCALIC R # +09E1 ; 098B 09C3 ; MA # ( ৡ → ঋৃ ) BENGALI LETTER VOCALIC LL → BENGALI LETTER VOCALIC R, BENGALI VOWEL SIGN VOCALIC R # →ঌৢ→→ৠ→ + +11492 ; 0998 ; MA # ( 𑒒 → ঘ ) TIRHUTA LETTER GHA → BENGALI LETTER GHA # + +11494 ; 099A ; MA # ( 𑒔 → চ ) TIRHUTA LETTER CA → BENGALI LETTER CA # + +11496 ; 099C ; MA # ( 𑒖 → জ ) TIRHUTA LETTER JA → BENGALI LETTER JA # + +11498 ; 099E ; MA # ( 𑒘 → ঞ ) TIRHUTA LETTER NYA → BENGALI LETTER NYA # + +11499 ; 099F ; MA # ( 𑒙 → ট ) TIRHUTA LETTER TTA → BENGALI LETTER TTA # + +1149B ; 09A1 ; MA # ( 𑒛 → ড ) TIRHUTA LETTER DDA → BENGALI LETTER DDA # + +114AA ; 09A3 ; MA # ( 𑒪 → ণ ) TIRHUTA LETTER LA → BENGALI LETTER NNA # + +1149E ; 09A4 ; MA # ( 𑒞 → ত ) TIRHUTA LETTER TA → BENGALI LETTER TA # + +1149F ; 09A5 ; MA # ( 𑒟 → থ ) TIRHUTA LETTER THA → BENGALI LETTER THA # + +114A0 ; 09A6 ; MA # ( 𑒠 → দ ) TIRHUTA LETTER DA → BENGALI LETTER DA # + +114A1 ; 09A7 ; MA # ( 𑒡 → ধ ) TIRHUTA LETTER DHA → BENGALI LETTER DHA # + +114A2 ; 09A8 ; MA # ( 𑒢 → ন ) TIRHUTA LETTER NA → BENGALI LETTER NA # + +114A3 ; 09AA ; MA # ( 𑒣 → প ) TIRHUTA LETTER PA → BENGALI LETTER PA # + +114A9 ; 09AC ; MA # ( 𑒩 → ব ) TIRHUTA LETTER RA → BENGALI LETTER BA # + +114A7 ; 09AE ; MA # ( 𑒧 → ম ) TIRHUTA LETTER MA → BENGALI LETTER MA # + +114A8 ; 09AF ; MA # ( 𑒨 → য ) TIRHUTA LETTER YA → BENGALI LETTER YA # + +09F0 ; 09B0 ; MA # ( ৰ → র ) BENGALI LETTER RA WITH MIDDLE DIAGONAL → BENGALI LETTER RA # +114AB ; 09B0 ; MA # ( 𑒫 → র ) TIRHUTA LETTER VA → BENGALI LETTER RA # + +1149D ; 09B2 ; MA # ( 𑒝 → ল ) TIRHUTA LETTER NNA → BENGALI LETTER LA # + +114AD ; 09B7 ; MA # ( 𑒭 → ষ ) TIRHUTA LETTER SSA → BENGALI LETTER SSA # + +114AE ; 09B8 ; MA # ( 𑒮 → স ) TIRHUTA LETTER SA → BENGALI LETTER SA # + +114C4 ; 09BD ; MA # ( 𑓄 → ঽ ) TIRHUTA SIGN AVAGRAHA → BENGALI SIGN AVAGRAHA # + +114B0 ; 09BE ; MA # ( 𑒰 → া ) TIRHUTA VOWEL SIGN AA → BENGALI VOWEL SIGN AA # + +093F ; 09BF ; MA # ( ि → ি ) DEVANAGARI VOWEL SIGN I → BENGALI VOWEL SIGN I # +0A3F ; 
09BF ; MA # ( ਿ → ি ) GURMUKHI VOWEL SIGN I → BENGALI VOWEL SIGN I # →ि→ +114B1 ; 09BF ; MA # ( 𑒱 → ি ) TIRHUTA VOWEL SIGN I → BENGALI VOWEL SIGN I # + +114B9 ; 09C7 ; MA # ( 𑒹 → ে ) TIRHUTA VOWEL SIGN E → BENGALI VOWEL SIGN E # + +114BC ; 09CB ; MA # ( 𑒼 → ো ) TIRHUTA VOWEL SIGN O → BENGALI VOWEL SIGN O # + +114BE ; 09CC ; MA # ( 𑒾 → ৌ ) TIRHUTA VOWEL SIGN AU → BENGALI VOWEL SIGN AU # + +114C2 ; 09CD ; MA # ( 𑓂 → ্ ) TIRHUTA SIGN VIRAMA → BENGALI SIGN VIRAMA # + +114BD ; 09D7 ; MA # ( 𑒽 → ৗ ) TIRHUTA VOWEL SIGN SHORT O → BENGALI AU LENGTH MARK # + +0A09 ; 0A73 11B62 ; MA # ( ਉ → ੳ𑭢 ) GURMUKHI LETTER U → GURMUKHI URA, SHARADA VOWEL SIGN UE # →ੳੁ→ + +0A0A ; 0A73 11B63 ; MA # ( ਊ → ੳ𑭣 ) GURMUKHI LETTER UU → GURMUKHI URA, SHARADA VOWEL SIGN UUE # →ੳੂ→ + +0A10 ; 0A05 0948 ; MA # ( ਐ → ਅै ) GURMUKHI LETTER AI → GURMUKHI LETTER A, DEVANAGARI VOWEL SIGN AI # →ਅੈ→ + +0A06 ; 0A05 0A3E ; MA # ( ਆ → ਅਾ ) GURMUKHI LETTER AA → GURMUKHI LETTER A, GURMUKHI VOWEL SIGN AA # + +0A14 ; 0A05 0A4C ; MA # ( ਔ → ਅੌ ) GURMUKHI LETTER AU → GURMUKHI LETTER A, GURMUKHI VOWEL SIGN AU # + +0A86 ; 0A85 0ABE ; MA # ( આ → અા ) GUJARATI LETTER AA → GUJARATI LETTER A, GUJARATI VOWEL SIGN AA # + +0A91 ; 0A85 0ABE 0AC5 ; MA # ( ઑ → અાૅ ) GUJARATI VOWEL CANDRA O → GUJARATI LETTER A, GUJARATI VOWEL SIGN AA, GUJARATI VOWEL SIGN CANDRA E # →અૉ→→આૅ→ + +0A93 ; 0A85 0ABE 0AC7 ; MA # ( ઓ → અાે ) GUJARATI LETTER O → GUJARATI LETTER A, GUJARATI VOWEL SIGN AA, GUJARATI VOWEL SIGN E # →અો→→આે→ + +0A94 ; 0A85 0ABE 0AC8 ; MA # ( ઔ → અાૈ ) GUJARATI LETTER AU → GUJARATI LETTER A, GUJARATI VOWEL SIGN AA, GUJARATI VOWEL SIGN AI # →અૌ→→આૈ→ + +0A8D ; 0A85 0AC5 ; MA # ( ઍ → અૅ ) GUJARATI VOWEL CANDRA E → GUJARATI LETTER A, GUJARATI VOWEL SIGN CANDRA E # + +0A8F ; 0A85 0AC7 ; MA # ( એ → અે ) GUJARATI LETTER E → GUJARATI LETTER A, GUJARATI VOWEL SIGN E # + +0A90 ; 0A85 0AC8 ; MA # ( ઐ → અૈ ) GUJARATI LETTER AI → GUJARATI LETTER A, GUJARATI VOWEL SIGN AI # + +0B06 ; 0B05 0B3E ; MA # ( ଆ → ଅା ) ORIYA LETTER AA → ORIYA LETTER 
A, ORIYA VOWEL SIGN AA # + +1031 ; 0B47 ; MA # ( ေ → େ ) MYANMAR VOWEL SIGN E → ORIYA VOWEL SIGN E # + +0BEE ; 0B85 ; MA # ( ௮ → அ ) TAMIL DIGIT EIGHT → TAMIL LETTER A # + +0BB0 ; 0B88 ; MA # ( ர → ஈ ) TAMIL LETTER RA → TAMIL LETTER II # →ா→ +0BBE ; 0B88 ; MA # ( ா → ஈ ) TAMIL VOWEL SIGN AA → TAMIL LETTER II # + +0BEB ; 0B88 0BC1 ; MA # ( ௫ → ஈு ) TAMIL DIGIT FIVE → TAMIL LETTER II, TAMIL VOWEL SIGN U # →ரு→ + +0BE8 ; 0B89 ; MA # ( ௨ → உ ) TAMIL DIGIT TWO → TAMIL LETTER U # +0D09 ; 0B89 ; MA # ( ഉ → உ ) MALAYALAM LETTER U → TAMIL LETTER U # + +0B8A ; 0B89 0BB3 ; MA # ( ஊ → உள ) TAMIL LETTER UU → TAMIL LETTER U, TAMIL LETTER LLA # + +0D0A ; 0B89 0D57 ; MA # ( ഊ → உൗ ) MALAYALAM LETTER UU → TAMIL LETTER U, MALAYALAM AU LENGTH MARK # →ഉൗ→ + +0BED ; 0B8E ; MA # ( ௭ → எ ) TAMIL DIGIT SEVEN → TAMIL LETTER E # + +0BF7 ; 0B8E 0BB5 ; MA #* ( ௷ → எவ ) TAMIL CREDIT SIGN → TAMIL LETTER E, TAMIL LETTER VA # + +0B9C ; 0B90 ; MA # ( ஜ → ஐ ) TAMIL LETTER JA → TAMIL LETTER AI # +0D1C ; 0B90 ; MA # ( ജ → ஐ ) MALAYALAM LETTER JA → TAMIL LETTER AI # →ஜ→ + +0B94 ; 0B92 0BB3 ; MA # ( ஔ → ஒள ) TAMIL LETTER AU → TAMIL LETTER O, TAMIL LETTER LLA # + +0BE7 ; 0B95 ; MA # ( ௧ → க ) TAMIL DIGIT ONE → TAMIL LETTER KA # + +0BEA ; 0B9A ; MA # ( ௪ → ச ) TAMIL DIGIT FOUR → TAMIL LETTER CA # + +0BEC ; 0B9A 0BC1 ; MA # ( ௬ → சு ) TAMIL DIGIT SIX → TAMIL LETTER CA, TAMIL VOWEL SIGN U # + +0BF2 ; 0B9A 0BC2 ; MA #* ( ௲ → சூ ) TAMIL NUMBER ONE THOUSAND → TAMIL LETTER CA, TAMIL VOWEL SIGN UU # + +0D3A ; 0B9F 0BBF ; MA # ( ഺ → டி ) MALAYALAM LETTER TTTA → TAMIL LETTER TTA, TAMIL VOWEL SIGN I # + +0D23 ; 0BA3 ; MA # ( ണ → ண ) MALAYALAM LETTER NNA → TAMIL LETTER NNA # + +0D7A ; 0BA3 0D4D ; MA # ( ൺ → ண് ) MALAYALAM LETTER CHILLU NN → TAMIL LETTER NNA, MALAYALAM SIGN VIRAMA # →ണ്→ + +0BFA ; 0BA8 0BC0 ; MA #* ( ௺ → நீ ) TAMIL NUMBER SIGN → TAMIL LETTER NA, TAMIL VOWEL SIGN II # + +0D25 ; 0BAE ; MA # ( ഥ → ம ) MALAYALAM LETTER THA → TAMIL LETTER MA # + +0BF4 ; 0BAE 0BC0 ; MA #* ( ௴ → மீ ) TAMIL MONTH SIGN → 
TAMIL LETTER MA, TAMIL VOWEL SIGN II # + +0BF0 ; 0BAF ; MA #* ( ௰ → ய ) TAMIL NUMBER TEN → TAMIL LETTER YA # + +0D16 ; 0BB5 ; MA # ( ഖ → வ ) MALAYALAM LETTER KHA → TAMIL LETTER VA # + +0D34 ; 0BB4 ; MA # ( ഴ → ழ ) MALAYALAM LETTER LLLA → TAMIL LETTER LLLA # + +0BD7 ; 0BB3 ; MA # ( ௗ → ள ) TAMIL AU LENGTH MARK → TAMIL LETTER LLA # + +0BC8 ; 0BA9 ; MA # ( ை → ன ) TAMIL VOWEL SIGN AI → TAMIL LETTER NNNA # + +0BB8 ; 0BB6 ; MA # ( ஸ → ஶ ) TAMIL LETTER SA → TAMIL LETTER SHA # +0D36 ; 0BB6 ; MA # ( ശ → ஶ ) MALAYALAM LETTER SHA → TAMIL LETTER SHA # + +0BF8 ; 0BB7 ; MA #* ( ௸ → ஷ ) TAMIL AS ABOVE SIGN → TAMIL LETTER SSA # + +0D3F ; 0BBF ; MA # ( ി → ி ) MALAYALAM VOWEL SIGN I → TAMIL VOWEL SIGN I # +0D40 ; 0BBF ; MA # ( ീ → ி ) MALAYALAM VOWEL SIGN II → TAMIL VOWEL SIGN I # + +0D46 ; 0BC6 ; MA # ( െ → ெ ) MALAYALAM VOWEL SIGN E → TAMIL VOWEL SIGN E # + +0BCA ; 0BC6 0B88 ; MA # ( ொ → ெஈ ) TAMIL VOWEL SIGN O → TAMIL VOWEL SIGN E, TAMIL LETTER II # →ெர→ + +0BCC ; 0BC6 0BB3 ; MA # ( ௌ → ெள ) TAMIL VOWEL SIGN AU → TAMIL VOWEL SIGN E, TAMIL LETTER LLA # + +0D48 ; 0BC6 0BC6 ; MA # ( ൈ → ெெ ) MALAYALAM VOWEL SIGN AI → TAMIL VOWEL SIGN E, TAMIL VOWEL SIGN E # →െെ→ + +0D10 ; 0BC6 0D0E ; MA # ( ഐ → ெഎ ) MALAYALAM LETTER AI → TAMIL VOWEL SIGN E, MALAYALAM LETTER E # →െഎ→ + +0D47 ; 0BC7 ; MA # ( േ → ே ) MALAYALAM VOWEL SIGN EE → TAMIL VOWEL SIGN EE # + +0BCB ; 0BC7 0B88 ; MA # ( ோ → ேஈ ) TAMIL VOWEL SIGN OO → TAMIL VOWEL SIGN EE, TAMIL LETTER II # →ேர→ + +0C85 ; 0C05 ; MA # ( ಅ → అ ) KANNADA LETTER A → TELUGU LETTER A # + +0C86 ; 0C06 ; MA # ( ಆ → ఆ ) KANNADA LETTER AA → TELUGU LETTER AA # + +0C87 ; 0C07 ; MA # ( ಇ → ఇ ) KANNADA LETTER I → TELUGU LETTER I # + +0C60 ; 0C0B 0C3E ; MA # ( ౠ → ఋా ) TELUGU LETTER VOCALIC RR → TELUGU LETTER VOCALIC R, TELUGU VOWEL SIGN AA # + +0C61 ; 0C0C 0C3E ; MA # ( ౡ → ఌా ) TELUGU LETTER VOCALIC LL → TELUGU LETTER VOCALIC L, TELUGU VOWEL SIGN AA # + +0C90 ; 0C10 ; MA # ( ಐ → ఐ ) KANNADA LETTER AI → TELUGU LETTER AI # + +0C92 ; 0C12 ; MA # ( ಒ → ఒ ) 
KANNADA LETTER O → TELUGU LETTER O # + +0C14 ; 0C12 0C4C ; MA # ( ఔ → ఒౌ ) TELUGU LETTER AU → TELUGU LETTER O, TELUGU VOWEL SIGN AU # +0C94 ; 0C12 0C4C ; MA # ( ಔ → ఒౌ ) KANNADA LETTER AU → TELUGU LETTER O, TELUGU VOWEL SIGN AU # →ఔ→ + +0C13 ; 0C12 0C55 ; MA # ( ఓ → ఒౕ ) TELUGU LETTER OO → TELUGU LETTER O, TELUGU LENGTH MARK # +0C93 ; 0C12 0C55 ; MA # ( ಓ → ఒౕ ) KANNADA LETTER OO → TELUGU LETTER O, TELUGU LENGTH MARK # →ఓ→ + +0C97 ; 0C17 ; MA # ( ಗ → గ ) KANNADA LETTER GA → TELUGU LETTER GA # + +0C9C ; 0C1C ; MA # ( ಜ → జ ) KANNADA LETTER JA → TELUGU LETTER JA # + +0C9D ; 0C1D ; MA # ( ಝ → ఝ ) KANNADA LETTER JHA → TELUGU LETTER JHA # + +0C9E ; 0C1E ; MA # ( ಞ → ఞ ) KANNADA LETTER NYA → TELUGU LETTER NYA # + +0C9F ; 0C1F ; MA # ( ಟ → ట ) KANNADA LETTER TTA → TELUGU LETTER TTA # + +0C22 ; 0C21 0323 ; MA # ( ఢ → డ̣ ) TELUGU LETTER DDHA → TELUGU LETTER DDA, COMBINING DOT BELOW # + +0CA3 ; 0C23 ; MA # ( ಣ → ణ ) KANNADA LETTER NNA → TELUGU LETTER NNA # + +0CA6 ; 0C26 ; MA # ( ದ → ద ) KANNADA LETTER DA → TELUGU LETTER DA # + +0C25 ; 0C27 05BC ; MA # ( థ → ధּ ) TELUGU LETTER THA → TELUGU LETTER DHA, HEBREW POINT DAGESH OR MAPIQ # + +0CA8 ; 0C28 ; MA # ( ನ → న ) KANNADA LETTER NA → TELUGU LETTER NA # + +0C2D ; 0C2C 0323 ; MA # ( భ → బ̣ ) TELUGU LETTER BHA → TELUGU LETTER BA, COMBINING DOT BELOW # + +0CAF ; 0C2F ; MA # ( ಯ → య ) KANNADA LETTER YA → TELUGU LETTER YA # + +0CB0 ; 0C30 ; MA # ( ರ → ర ) KANNADA LETTER RA → TELUGU LETTER RA # + +0C20 ; 0C30 05BC ; MA # ( ఠ → రּ ) TELUGU LETTER TTHA → TELUGU LETTER RA, HEBREW POINT DAGESH OR MAPIQ # + +0CB1 ; 0C31 ; MA # ( ಱ → ఱ ) KANNADA LETTER RRA → TELUGU LETTER RRA # + +0CB2 ; 0C32 ; MA # ( ಲ → ల ) KANNADA LETTER LA → TELUGU LETTER LA # + +0C37 ; 0C35 0323 ; MA # ( ష → వ̣ ) TELUGU LETTER SSA → TELUGU LETTER VA, COMBINING DOT BELOW # + +0C39 ; 0C35 0C3E ; MA # ( హ → వా ) TELUGU LETTER HA → TELUGU LETTER VA, TELUGU VOWEL SIGN AA # + +0C2E ; 0C35 0C41 ; MA # ( మ → వు ) TELUGU LETTER MA → TELUGU LETTER VA, TELUGU VOWEL SIGN U # + 
+0CB3 ; 0C33 ; MA # ( ಳ → ళ ) KANNADA LETTER LLA → TELUGU LETTER LLA # + +0CBF ; 0C3F ; MA # ( ಿ → ి ) KANNADA VOWEL SIGN I → TELUGU VOWEL SIGN I # + +0CC1 ; 0C41 ; MA # ( ು → ు ) KANNADA VOWEL SIGN U → TELUGU VOWEL SIGN U # + +0C42 ; 0C41 0C3E ; MA # ( ూ → ుా ) TELUGU VOWEL SIGN UU → TELUGU VOWEL SIGN U, TELUGU VOWEL SIGN AA # + +0CC3 ; 0C43 ; MA # ( ೃ → ృ ) KANNADA VOWEL SIGN VOCALIC R → TELUGU VOWEL SIGN VOCALIC R # + +0C44 ; 0C43 0C3E ; MA # ( ౄ → ృా ) TELUGU VOWEL SIGN VOCALIC RR → TELUGU VOWEL SIGN VOCALIC R, TELUGU VOWEL SIGN AA # + +0CE1 ; 0C8C 0CBE ; MA # ( ೡ → ಌಾ ) KANNADA LETTER VOCALIC LL → KANNADA LETTER VOCALIC L, KANNADA VOWEL SIGN AA # + +0C16 ; 0C96 0323 ; MA # ( ఖ → ಖ̣ ) TELUGU LETTER KHA → KANNADA LETTER KHA, COMBINING DOT BELOW # + +0D08 ; 0D07 0D57 ; MA # ( ഈ → ഇൗ ) MALAYALAM LETTER II → MALAYALAM LETTER I, MALAYALAM AU LENGTH MARK # + +0D13 ; 0D12 0D3E ; MA # ( ഓ → ഒാ ) MALAYALAM LETTER OO → MALAYALAM LETTER O, MALAYALAM VOWEL SIGN AA # + +0D14 ; 0D12 0D57 ; MA # ( ഔ → ഒൗ ) MALAYALAM LETTER AU → MALAYALAM LETTER O, MALAYALAM AU LENGTH MARK # + +0D61 ; 0D1E ; MA # ( ൡ → ഞ ) MALAYALAM LETTER VOCALIC LL → MALAYALAM LETTER NYA # + +0D6B ; 0D26 0D4D 0D30 ; MA # ( ൫ → ദ്ര ) MALAYALAM DIGIT FIVE → MALAYALAM LETTER DA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER RA # + +0D79 ; 0D28 0D41 ; MA #* ( ൹ → നു ) MALAYALAM DATE MARK → MALAYALAM LETTER NA, MALAYALAM VOWEL SIGN U # +0D0C ; 0D28 0D41 ; MA # ( ഌ → നു ) MALAYALAM LETTER VOCALIC L → MALAYALAM LETTER NA, MALAYALAM VOWEL SIGN U # +0D19 ; 0D28 0D41 ; MA # ( ങ → നു ) MALAYALAM LETTER NGA → MALAYALAM LETTER NA, MALAYALAM VOWEL SIGN U # →ഌ→ + +0D6F ; 0D28 0D4D ; MA # ( ൯ → ന് ) MALAYALAM DIGIT NINE → MALAYALAM LETTER NA, MALAYALAM SIGN VIRAMA # +0D7B ; 0D28 0D4D ; MA # ( ൻ → ന് ) MALAYALAM LETTER CHILLU N → MALAYALAM LETTER NA, MALAYALAM SIGN VIRAMA # →൯→ + +0D6C ; 0D28 0D4D 0D28 ; MA # ( ൬ → ന്ന ) MALAYALAM DIGIT SIX → MALAYALAM LETTER NA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER NA # + +0D5A ; 0D28 0D4D 
0D2E ; MA #* ( ൚ → ന്മ ) MALAYALAM FRACTION THREE EIGHTIETHS → MALAYALAM LETTER NA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER MA # + +10D8 ; 0D30 ; MA # ( ი → ര ) GEORGIAN LETTER IN → MALAYALAM LETTER RA # →റ→ +0D31 ; 0D30 ; MA # ( റ → ര ) MALAYALAM LETTER RRA → MALAYALAM LETTER RA # +1002 ; 0D30 ; MA # ( ဂ → ര ) MYANMAR LETTER GA → MALAYALAM LETTER RA # →റ→ + +0D6A ; 0D30 0D4D ; MA # ( ൪ → ര് ) MALAYALAM DIGIT FOUR → MALAYALAM LETTER RA, MALAYALAM SIGN VIRAMA # +0D7C ; 0D30 0D4D ; MA # ( ർ → ര് ) MALAYALAM LETTER CHILLU RR → MALAYALAM LETTER RA, MALAYALAM SIGN VIRAMA # →൪→ + +1081 ; 0D30 103E ; MA # ( ႁ → രှ ) MYANMAR LETTER SHAN HA → MALAYALAM LETTER RA, MYANMAR CONSONANT SIGN MEDIAL HA # →ဂှ→ + +1000 ; 0D30 102C ; MA # ( က → രာ ) MYANMAR LETTER KA → MALAYALAM LETTER RA, MYANMAR VOWEL SIGN AA # →ဂာ→ + +1023 ; 0D30 102C 1039 0D30 102C ; MA # ( ဣ → രာ္രာ ) MYANMAR LETTER I → MALAYALAM LETTER RA, MYANMAR VOWEL SIGN AA, MYANMAR SIGN VIRAMA, MALAYALAM LETTER RA, MYANMAR VOWEL SIGN AA # →က္က→ + +0D7D ; 0D32 0D4D ; MA # ( ൽ → ല് ) MALAYALAM LETTER CHILLU L → MALAYALAM LETTER LA, MALAYALAM SIGN VIRAMA # + +0D6E ; 0D35 0D4D 0D30 ; MA # ( ൮ → വ്ര ) MALAYALAM DIGIT EIGHT → MALAYALAM LETTER VA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER RA # + +0D76 ; 0D39 0D4D 0D2E ; MA #* ( ൶ → ഹ്മ ) MALAYALAM FRACTION ONE SIXTEENTH → MALAYALAM LETTER HA, MALAYALAM SIGN VIRAMA, MALAYALAM LETTER MA # + +0D7E ; 0D33 0D4D ; MA # ( ൾ → ള് ) MALAYALAM LETTER CHILLU LL → MALAYALAM LETTER LLA, MALAYALAM SIGN VIRAMA # + +0D42 ; 0D41 ; MA # ( ൂ → ു ) MALAYALAM VOWEL SIGN UU → MALAYALAM VOWEL SIGN U # +0D43 ; 0D41 ; MA # ( ൃ → ു ) MALAYALAM VOWEL SIGN VOCALIC R → MALAYALAM VOWEL SIGN U # →ൂ→ + +0DB5 ; 0D91 ; MA # ( ඵ → එ ) SINHALA LETTER MAHAAPRAANA PAYANNA → SINHALA LETTER EYANNA # + +0D93 ; 0D91 0DD9 ; MA # ( ඓ → එෙ ) SINHALA LETTER AIYANNA → SINHALA LETTER EYANNA, SINHALA VOWEL SIGN KOMBUVA # + +0D92 ; 0D91 0DCA ; MA # ( ඒ → එ් ) SINHALA LETTER EEYANNA → SINHALA LETTER EYANNA, SINHALA SIGN AL-LAKUNA # 
+ +0DB9 ; 0D94 ; MA # ( ඹ → ඔ ) SINHALA LETTER AMBA BAYANNA → SINHALA LETTER OYANNA # + +0DB6 ; 0D9B ; MA # ( බ → ඛ ) SINHALA LETTER ALPAPRAANA BAYANNA → SINHALA LETTER MAHAAPRAANA KAYANNA # + +0DC0 ; 0DA0 ; MA # ( ව → ච ) SINHALA LETTER VAYANNA → SINHALA LETTER ALPAPRAANA CAYANNA # + +0DEA ; 0DA2 ; MA # ( ෪ → ජ ) SINHALA LITH DIGIT FOUR → SINHALA LETTER ALPAPRAANA JAYANNA # + +0DEB ; 0DAF ; MA # ( ෫ → ද ) SINHALA LITH DIGIT FIVE → SINHALA LETTER ALPAPRAANA DAYANNA # + +0DC4 ; 0DB7 ; MA # ( හ → භ ) SINHALA LETTER HAYANNA → SINHALA LETTER MAHAAPRAANA BAYANNA # + +0D8D ; 0DC3 0DD8 ; MA # ( ඍ → සෘ ) SINHALA LETTER IRUYANNA → SINHALA LETTER DANTAJA SAYANNA, SINHALA VOWEL SIGN GAETTA-PILLA # + +11413 ; 11434 11442 11412 ; MA # ( 𑐓 → 𑐴𑑂𑐒 ) NEWA LETTER NGHA → NEWA LETTER HA, NEWA SIGN VIRAMA, NEWA LETTER NGA # + +11419 ; 11434 11442 11418 ; MA # ( 𑐙 → 𑐴𑑂𑐘 ) NEWA LETTER NYHA → NEWA LETTER HA, NEWA SIGN VIRAMA, NEWA LETTER NYA # + +11424 ; 11434 11442 11423 ; MA # ( 𑐤 → 𑐴𑑂𑐣 ) NEWA LETTER NHA → NEWA LETTER HA, NEWA SIGN VIRAMA, NEWA LETTER NA # + +1142A ; 11434 11442 11429 ; MA # ( 𑐪 → 𑐴𑑂𑐩 ) NEWA LETTER MHA → NEWA LETTER HA, NEWA SIGN VIRAMA, NEWA LETTER MA # + +1142D ; 11434 11442 1142C ; MA # ( 𑐭 → 𑐴𑑂𑐬 ) NEWA LETTER RHA → NEWA LETTER HA, NEWA SIGN VIRAMA, NEWA LETTER RA # + +1142F ; 11434 11442 1142E ; MA # ( 𑐯 → 𑐴𑑂𑐮 ) NEWA LETTER LHA → NEWA LETTER HA, NEWA SIGN VIRAMA, NEWA LETTER LA # + +115D8 ; 11582 ; MA # ( 𑗘 → 𑖂 ) SIDDHAM LETTER THREE-CIRCLE ALTERNATE I → SIDDHAM LETTER I # +115D9 ; 11582 ; MA # ( 𑗙 → 𑖂 ) SIDDHAM LETTER TWO-CIRCLE ALTERNATE I → SIDDHAM LETTER I # + +115DA ; 11583 ; MA # ( 𑗚 → 𑖃 ) SIDDHAM LETTER TWO-CIRCLE ALTERNATE II → SIDDHAM LETTER II # + +115DB ; 11584 ; MA # ( 𑗛 → 𑖄 ) SIDDHAM LETTER ALTERNATE U → SIDDHAM LETTER U # + +115DC ; 115B2 ; MA # ( 𑗜 → 𑖲 ) SIDDHAM VOWEL SIGN ALTERNATE U → SIDDHAM VOWEL SIGN U # + +115DD ; 115B3 ; MA # ( 𑗝 → 𑖳 ) SIDDHAM VOWEL SIGN ALTERNATE UU → SIDDHAM VOWEL SIGN UU # + +0E03 ; 0E02 ; MA # ( ฃ → ข ) THAI CHARACTER KHO 
KHUAT → THAI CHARACTER KHO KHAI # + +0E14 ; 0E04 ; MA # ( ด → ค ) THAI CHARACTER DO DEK → THAI CHARACTER KHO KHWAI # +0E15 ; 0E04 ; MA # ( ต → ค ) THAI CHARACTER TO TAO → THAI CHARACTER KHO KHWAI # →ด→ + +0E21 ; 0E06 ; MA # ( ม → ฆ ) THAI CHARACTER MO MA → THAI CHARACTER KHO RAKHANG # + +0E88 ; 0E08 ; MA # ( ຈ → จ ) LAO LETTER CO → THAI CHARACTER CHO CHAN # + +0E0B ; 0E0A ; MA # ( ซ → ช ) THAI CHARACTER SO SO → THAI CHARACTER CHO CHANG # + +0E0F ; 0E0E ; MA # ( ฏ → ฎ ) THAI CHARACTER TO PATAK → THAI CHARACTER DO CHADA # + +0E17 ; 0E11 ; MA # ( ท → ฑ ) THAI CHARACTER THO THAHAN → THAI CHARACTER THO NANGMONTHO # + +0E9A ; 0E1A ; MA # ( ບ → บ ) LAO LETTER BO → THAI CHARACTER BO BAIMAI # + +0E9B ; 0E1B ; MA # ( ປ → ป ) LAO LETTER PO → THAI CHARACTER PO PLA # + +0E9D ; 0E1D ; MA # ( ຝ → ฝ ) LAO LETTER FO TAM → THAI CHARACTER FO FA # + +0E9E ; 0E1E ; MA # ( ພ → พ ) LAO LETTER PHO TAM → THAI CHARACTER PHO PHAN # + +0E9F ; 0E1F ; MA # ( ຟ → ฟ ) LAO LETTER FO SUNG → THAI CHARACTER FO FAN # + +0E26 ; 0E20 ; MA # ( ฦ → ภ ) THAI CHARACTER LU → THAI CHARACTER PHO SAMPHAO # + +0E8D ; 0E22 ; MA # ( ຍ → ย ) LAO LETTER NYO → THAI CHARACTER YO YAK # + +17D4 ; 0E2F ; MA #* ( ។ → ฯ ) KHMER SIGN KHAN → THAI CHARACTER PAIYANNOI # + +0E45 ; 0E32 ; MA # ( ๅ → า ) THAI CHARACTER LAKKHANGYAO → THAI CHARACTER SARA AA # + +0E33 ; 030A 0E32 ; MA # ( ำ → ̊า ) THAI CHARACTER SARA AM → COMBINING RING ABOVE, THAI CHARACTER SARA AA # →ํา→ + +17B7 ; 0E34 ; MA # ( ិ → ิ ) KHMER VOWEL SIGN I → THAI CHARACTER SARA I # + +17B8 ; 0E35 ; MA # ( ី → ี ) KHMER VOWEL SIGN II → THAI CHARACTER SARA II # + +17B9 ; 0E36 ; MA # ( ឹ → ึ ) KHMER VOWEL SIGN Y → THAI CHARACTER SARA UE # + +17BA ; 0E37 ; MA # ( ឺ → ื ) KHMER VOWEL SIGN YY → THAI CHARACTER SARA UEE # + +0EB8 ; 0E38 ; MA # ( ຸ → ุ ) LAO VOWEL SIGN U → THAI CHARACTER SARA U # + +0EB9 ; 0E39 ; MA # ( ູ → ู ) LAO VOWEL SIGN UU → THAI CHARACTER SARA UU # + +0E41 ; 0E40 0E40 ; MA # ( แ → เเ ) THAI CHARACTER SARA AE → THAI CHARACTER SARA E, THAI CHARACTER 
SARA E # + +0EDC ; 0EAB 0E99 ; MA # ( ໜ → ຫນ ) LAO HO NO → LAO LETTER HO SUNG, LAO LETTER NO # + +0EDD ; 0EAB 0EA1 ; MA # ( ໝ → ຫມ ) LAO HO MO → LAO LETTER HO SUNG, LAO LETTER MO # + +0EB3 ; 030A 0EB2 ; MA # ( ຳ → ̊າ ) LAO VOWEL SIGN AM → COMBINING RING ABOVE, LAO VOWEL SIGN AA # →ໍາ→ + +0F02 ; 0F60 0F74 0F82 0F7F ; MA #* ( ༂ → འུྂཿ ) TIBETAN MARK GTER YIG MGO -UM RNAM BCAD MA → TIBETAN LETTER -A, TIBETAN VOWEL SIGN U, TIBETAN SIGN NYI ZLA NAA DA, TIBETAN SIGN RNAM BCAD # + +0F03 ; 0F60 0F74 0F82 0F14 ; MA #* ( ༃ → འུྂ༔ ) TIBETAN MARK GTER YIG MGO -UM GTER TSHEG MA → TIBETAN LETTER -A, TIBETAN VOWEL SIGN U, TIBETAN SIGN NYI ZLA NAA DA, TIBETAN MARK GTER TSHEG # + +0F6A ; 0F62 ; MA # ( ཪ → ར ) TIBETAN LETTER FIXED-FORM RA → TIBETAN LETTER RA # + +0F00 ; 0F68 0F7C 0F7E ; MA # ( ༀ → ཨོཾ ) TIBETAN SYLLABLE OM → TIBETAN LETTER A, TIBETAN VOWEL SIGN O, TIBETAN SIGN RJES SU NGA RO # + +0F77 ; 0FB2 0F71 0F80 ; MA # ( ཷ → ྲཱྀ ) TIBETAN VOWEL SIGN VOCALIC RR → TIBETAN SUBJOINED LETTER RA, TIBETAN VOWEL SIGN AA, TIBETAN VOWEL SIGN REVERSED I # + +0F79 ; 0FB3 0F71 0F80 ; MA # ( ཹ → ླཱྀ ) TIBETAN VOWEL SIGN VOCALIC LL → TIBETAN SUBJOINED LETTER LA, TIBETAN VOWEL SIGN AA, TIBETAN VOWEL SIGN REVERSED I # + +0F7B ; 0F7A 0F7A ; MA # ( ཻ → ེེ ) TIBETAN VOWEL SIGN EE → TIBETAN VOWEL SIGN E, TIBETAN VOWEL SIGN E # + +0F7D ; 0F7C 0F7C ; MA # ( ཽ → ོོ ) TIBETAN VOWEL SIGN OO → TIBETAN VOWEL SIGN O, TIBETAN VOWEL SIGN O # + +11CB2 ; 11CAA ; MA # ( 𑲲 → 𑲪 ) MARCHEN VOWEL SIGN U → MARCHEN SUBJOINED LETTER RA # + +1734 ; 1715 ; MA # ( ᜴ → ᜕ ) HANUNOO SIGN PAMUDPOD → TAGALOG SIGN PAMUDPOD # + +1070 ; 1003 103E ; MA # ( ၰ → ဃှ ) MYANMAR LETTER EASTERN PWO KAREN GHWA → MYANMAR LETTER GHA, MYANMAR CONSONANT SIGN MEDIAL HA # + +1066 ; 1015 103E ; MA # ( ၦ → ပှ ) MYANMAR LETTER WESTERN PWO KAREN PWA → MYANMAR LETTER PA, MYANMAR CONSONANT SIGN MEDIAL HA # + +101F ; 1015 102C ; MA # ( ဟ → ပာ ) MYANMAR LETTER HA → MYANMAR LETTER PA, MYANMAR VOWEL SIGN AA # + +106F ; 1015 102C 103E ; MA # ( ၯ → ပာှ ) 
MYANMAR LETTER EASTERN PWO KAREN YWA → MYANMAR LETTER PA, MYANMAR VOWEL SIGN AA, MYANMAR CONSONANT SIGN MEDIAL HA # →ဟှ→ + +107E ; 107D 103E ; MA # ( ၾ → ၽှ ) MYANMAR LETTER SHAN FA → MYANMAR LETTER SHAN PHA, MYANMAR CONSONANT SIGN MEDIAL HA # + +1061 ; 101B 103E ; MA # ( ၡ → ရှ ) MYANMAR LETTER SGAW KAREN SHA → MYANMAR LETTER RA, MYANMAR CONSONANT SIGN MEDIAL HA # + +1029 ; 101E 103C ; MA # ( ဩ → သြ ) MYANMAR LETTER O → MYANMAR LETTER SA, MYANMAR CONSONANT SIGN MEDIAL RA # + +102A ; 101E 103C 0B47 102C 103A ; MA # ( ဪ → သြେာ် ) MYANMAR LETTER AU → MYANMAR LETTER SA, MYANMAR CONSONANT SIGN MEDIAL RA, ORIYA VOWEL SIGN E, MYANMAR VOWEL SIGN AA, MYANMAR SIGN ASAT # →ဩော်→ + +109E ; 1083 030A ; MA #* ( ႞ → ႃ̊ ) MYANMAR SYMBOL SHAN ONE → MYANMAR VOWEL SIGN SHAN AA, COMBINING RING ABOVE # →ႃံ→ + +178F ; 178A ; MA # ( ត → ដ ) KHMER LETTER TA → KHMER LETTER DA # + +17A3 ; 17A2 ; MA # ( ឣ → អ ) KHMER INDEPENDENT VOWEL QAQ → KHMER LETTER QA # + +19D0 ; 199E ; MA # ( ᧐ → ᦞ ) NEW TAI LUE DIGIT ZERO → NEW TAI LUE LETTER LOW VA # + +19D1 ; 19B1 ; MA # ( ᧑ → ᦱ ) NEW TAI LUE DIGIT ONE → NEW TAI LUE VOWEL SIGN AA # + +1A80 ; 1A45 ; MA # ( ᪀ → ᩅ ) TAI THAM HORA DIGIT ZERO → TAI THAM LETTER WA # +1A90 ; 1A45 ; MA # ( ᪐ → ᩅ ) TAI THAM THAM DIGIT ZERO → TAI THAM LETTER WA # + +AA53 ; AA01 ; MA # ( ꩓ → ꨁ ) CHAM DIGIT THREE → CHAM LETTER I # + +AA56 ; AA23 ; MA # ( ꩖ → ꨣ ) CHAM DIGIT SIX → CHAM LETTER RA # + +1B52 ; 1B0D ; MA # ( ᭒ → ᬍ ) BALINESE DIGIT TWO → BALINESE LETTER LA LENGA # + +1B53 ; 1B11 ; MA # ( ᭓ → ᬑ ) BALINESE DIGIT THREE → BALINESE LETTER OKARA # + +1B58 ; 1B28 ; MA # ( ᭘ → ᬨ ) BALINESE DIGIT EIGHT → BALINESE LETTER PA KAPAL # + +A9A3 ; A99D ; MA # ( ꦣ → ꦝ ) JAVANESE LETTER DA MAHAPRANA → JAVANESE LETTER DDA # + +1896 ; 185C ; MA # ( ᢖ → ᡜ ) MONGOLIAN LETTER ALI GALI ZA → MONGOLIAN LETTER TODO DZA # + +1855 ; 1835 ; MA # ( ᡕ → ᠵ ) MONGOLIAN LETTER TODO YA → MONGOLIAN LETTER JA # + +1FF6 ; 13EF ; MA # ( ῶ → Ꮿ ) GREEK SMALL LETTER OMEGA WITH PERISPOMENI → CHEROKEE LETTER YA 
# + +140D ; 1401 00B7 ; MA # ( ᐍ → ᐁ· ) CANADIAN SYLLABICS WEST-CREE WE → CANADIAN SYLLABICS E, MIDDLE DOT # →ᐁᐧ→ + +142B ; 1401 1420 ; MA # ( ᐫ → ᐁᐠ ) CANADIAN SYLLABICS EN → CANADIAN SYLLABICS E, CANADIAN SYLLABICS FINAL GRAVE # + +1411 ; 1404 00B7 ; MA # ( ᐑ → ᐄ· ) CANADIAN SYLLABICS WEST-CREE WII → CANADIAN SYLLABICS II, MIDDLE DOT # →ᐄᐧ→ + +1413 ; 1405 00B7 ; MA # ( ᐓ → ᐅ· ) CANADIAN SYLLABICS WEST-CREE WO → CANADIAN SYLLABICS O, MIDDLE DOT # →ᐅᐧ→ + +142D ; 1405 1420 ; MA # ( ᐭ → ᐅᐠ ) CANADIAN SYLLABICS ON → CANADIAN SYLLABICS O, CANADIAN SYLLABICS FINAL GRAVE # + +1415 ; 1406 00B7 ; MA # ( ᐕ → ᐆ· ) CANADIAN SYLLABICS WEST-CREE WOO → CANADIAN SYLLABICS OO, MIDDLE DOT # →ᐆᐧ→ + +1418 ; 140A 00B7 ; MA # ( ᐘ → ᐊ· ) CANADIAN SYLLABICS WEST-CREE WA → CANADIAN SYLLABICS A, MIDDLE DOT # →ᐊᐧ→ + +142E ; 140A 1420 ; MA # ( ᐮ → ᐊᐠ ) CANADIAN SYLLABICS AN → CANADIAN SYLLABICS A, CANADIAN SYLLABICS FINAL GRAVE # + +141A ; 140B 00B7 ; MA # ( ᐚ → ᐋ· ) CANADIAN SYLLABICS WEST-CREE WAA → CANADIAN SYLLABICS AA, MIDDLE DOT # →ᐋᐧ→ + +18DD ; 141E 18DF ; MA # ( ᣝ → ᐞᣟ ) CANADIAN SYLLABICS WESTERN W → CANADIAN SYLLABICS GLOTTAL STOP, CANADIAN SYLLABICS FINAL RAISED DOT # + +14D1 ; 1421 ; MA # ( ᓑ → ᐡ ) CANADIAN SYLLABICS CARRIER NG → CANADIAN SYLLABICS FINAL BOTTOM HALF RING # + +1540 ; 1429 ; MA # ( ᕀ → ᐩ ) CANADIAN SYLLABICS WEST-CREE Y → CANADIAN SYLLABICS FINAL PLUS # + +143F ; 1432 00B7 ; MA # ( ᐿ → ᐲ· ) CANADIAN SYLLABICS WEST-CREE PWII → CANADIAN SYLLABICS PII, MIDDLE DOT # →ᐲᐧ→ + +1443 ; 1434 00B7 ; MA # ( ᑃ → ᐴ· ) CANADIAN SYLLABICS WEST-CREE PWOO → CANADIAN SYLLABICS POO, MIDDLE DOT # →ᐴᐧ→ + +2369 ; 1435 ; MA #* ( ⍩ → ᐵ ) APL FUNCTIONAL SYMBOL GREATER-THAN DIAERESIS → CANADIAN SYLLABICS Y-CREE POO # + +1447 ; 1439 00B7 ; MA # ( ᑇ → ᐹ· ) CANADIAN SYLLABICS WEST-CREE PWAA → CANADIAN SYLLABICS PAA, MIDDLE DOT # →ᐹᐧ→ + +145C ; 144F 00B7 ; MA # ( ᑜ → ᑏ· ) CANADIAN SYLLABICS WEST-CREE TWII → CANADIAN SYLLABICS TII, MIDDLE DOT # →ᑏᐧ→ + +2E27 ; 1450 ; MA #* ( ⸧ → ᑐ ) RIGHT SIDEWAYS 
U BRACKET → CANADIAN SYLLABICS TO # →⊃→ +2283 ; 1450 ; MA #* ( ⊃ → ᑐ ) SUPERSET OF → CANADIAN SYLLABICS TO # + +145E ; 1450 00B7 ; MA # ( ᑞ → ᑐ· ) CANADIAN SYLLABICS WEST-CREE TWO → CANADIAN SYLLABICS TO, MIDDLE DOT # →ᑐᐧ→ + +1469 ; 1450 0027 ; MA # ( ᑩ → ᑐ' ) CANADIAN SYLLABICS TTO → CANADIAN SYLLABICS TO, APOSTROPHE # →ᑐᑊ→ + +27C9 ; 1450 002F ; MA #* ( ⟉ → ᑐ/ ) SUPERSET PRECEDING SOLIDUS → CANADIAN SYLLABICS TO, SOLIDUS # →⊃/→ + +2AD7 ; 1450 1455 ; MA #* ( ⫗ → ᑐᑕ ) SUPERSET BESIDE SUBSET → CANADIAN SYLLABICS TO, CANADIAN SYLLABICS TA # →⊃⊂→ + +1460 ; 1451 00B7 ; MA # ( ᑠ → ᑑ· ) CANADIAN SYLLABICS WEST-CREE TWOO → CANADIAN SYLLABICS TOO, MIDDLE DOT # →ᑑᐧ→ + +2E26 ; 1455 ; MA #* ( ⸦ → ᑕ ) LEFT SIDEWAYS U BRACKET → CANADIAN SYLLABICS TA # →⊂→ +2282 ; 1455 ; MA #* ( ⊂ → ᑕ ) SUBSET OF → CANADIAN SYLLABICS TA # + +1462 ; 1455 00B7 ; MA # ( ᑢ → ᑕ· ) CANADIAN SYLLABICS WEST-CREE TWA → CANADIAN SYLLABICS TA, MIDDLE DOT # →ᑕᐧ→ + +146A ; 1455 0027 ; MA # ( ᑪ → ᑕ' ) CANADIAN SYLLABICS TTA → CANADIAN SYLLABICS TA, APOSTROPHE # →ᑕᑊ→ + +1464 ; 1456 00B7 ; MA # ( ᑤ → ᑖ· ) CANADIAN SYLLABICS WEST-CREE TWAA → CANADIAN SYLLABICS TAA, MIDDLE DOT # →ᑖᐧ→ + +1475 ; 146B 00B7 ; MA # ( ᑵ → ᑫ· ) CANADIAN SYLLABICS WEST-CREE KWE → CANADIAN SYLLABICS KE, MIDDLE DOT # →ᑫᐧ→ + +1485 ; 146B 0027 ; MA # ( ᒅ → ᑫ' ) CANADIAN SYLLABICS SOUTH-SLAVEY KEH → CANADIAN SYLLABICS KE, APOSTROPHE # →ᑫᑊ→ + +1479 ; 146E 00B7 ; MA # ( ᑹ → ᑮ· ) CANADIAN SYLLABICS WEST-CREE KWII → CANADIAN SYLLABICS KII, MIDDLE DOT # →ᑮᐧ→ + +147D ; 1470 00B7 ; MA # ( ᑽ → ᑰ· ) CANADIAN SYLLABICS WEST-CREE KWOO → CANADIAN SYLLABICS KOO, MIDDLE DOT # →ᑰᐧ→ + +1603 ; 1489 ; MA # ( ᘃ → ᒉ ) CANADIAN SYLLABICS CARRIER NO → CANADIAN SYLLABICS CE # + +1493 ; 1489 00B7 ; MA # ( ᒓ → ᒉ· ) CANADIAN SYLLABICS WEST-CREE CWE → CANADIAN SYLLABICS CE, MIDDLE DOT # →ᒉᐧ→ + +1495 ; 148B 00B7 ; MA # ( ᒕ → ᒋ· ) CANADIAN SYLLABICS WEST-CREE CWI → CANADIAN SYLLABICS CI, MIDDLE DOT # →ᒋᐧ→ + +1497 ; 148C 00B7 ; MA # ( ᒗ → ᒌ· ) CANADIAN SYLLABICS WEST-CREE 
CWII → CANADIAN SYLLABICS CII, MIDDLE DOT # →ᒌᐧ→ + +149B ; 148E 00B7 ; MA # ( ᒛ → ᒎ· ) CANADIAN SYLLABICS WEST-CREE CWOO → CANADIAN SYLLABICS COO, MIDDLE DOT # →ᒎᐧ→ + +1602 ; 1490 ; MA # ( ᘂ → ᒐ ) CANADIAN SYLLABICS CARRIER NU → CANADIAN SYLLABICS CA # + +149D ; 1490 00B7 ; MA # ( ᒝ → ᒐ· ) CANADIAN SYLLABICS WEST-CREE CWA → CANADIAN SYLLABICS CA, MIDDLE DOT # →ᒐᐧ→ + +149F ; 1491 00B7 ; MA # ( ᒟ → ᒑ· ) CANADIAN SYLLABICS WEST-CREE CWAA → CANADIAN SYLLABICS CAA, MIDDLE DOT # →ᒑᐧ→ + +14AD ; 14A3 00B7 ; MA # ( ᒭ → ᒣ· ) CANADIAN SYLLABICS WEST-CREE MWE → CANADIAN SYLLABICS ME, MIDDLE DOT # →ᒣᐧ→ + +14B1 ; 14A6 00B7 ; MA # ( ᒱ → ᒦ· ) CANADIAN SYLLABICS WEST-CREE MWII → CANADIAN SYLLABICS MII, MIDDLE DOT # →ᒦᐧ→ + +14B3 ; 14A7 00B7 ; MA # ( ᒳ → ᒧ· ) CANADIAN SYLLABICS WEST-CREE MWO → CANADIAN SYLLABICS MO, MIDDLE DOT # →ᒧᐧ→ + +14B5 ; 14A8 00B7 ; MA # ( ᒵ → ᒨ· ) CANADIAN SYLLABICS WEST-CREE MWOO → CANADIAN SYLLABICS MOO, MIDDLE DOT # →ᒨᐧ→ + +14B9 ; 14AB 00B7 ; MA # ( ᒹ → ᒫ· ) CANADIAN SYLLABICS WEST-CREE MWAA → CANADIAN SYLLABICS MAA, MIDDLE DOT # →ᒫᐧ→ + +14CA ; 14C0 00B7 ; MA # ( ᓊ → ᓀ· ) CANADIAN SYLLABICS WEST-CREE NWE → CANADIAN SYLLABICS NE, MIDDLE DOT # →ᓀᐧ→ + +18C7 ; 14C2 00B7 ; MA # ( ᣇ → ᓂ· ) CANADIAN SYLLABICS OJIBWAY NWI → CANADIAN SYLLABICS NI, MIDDLE DOT # →ᓂᐧ→ + +18C9 ; 14C3 00B7 ; MA # ( ᣉ → ᓃ· ) CANADIAN SYLLABICS OJIBWAY NWII → CANADIAN SYLLABICS NII, MIDDLE DOT # →ᓃᐧ→ + +18CB ; 14C4 00B7 ; MA # ( ᣋ → ᓄ· ) CANADIAN SYLLABICS OJIBWAY NWO → CANADIAN SYLLABICS NO, MIDDLE DOT # →ᓄᐧ→ + +18CD ; 14C5 00B7 ; MA # ( ᣍ → ᓅ· ) CANADIAN SYLLABICS OJIBWAY NWOO → CANADIAN SYLLABICS NOO, MIDDLE DOT # →ᓅᐧ→ + +14CC ; 14C7 00B7 ; MA # ( ᓌ → ᓇ· ) CANADIAN SYLLABICS WEST-CREE NWA → CANADIAN SYLLABICS NA, MIDDLE DOT # →ᓇᐧ→ + +14CE ; 14C8 00B7 ; MA # ( ᓎ → ᓈ· ) CANADIAN SYLLABICS WEST-CREE NWAA → CANADIAN SYLLABICS NAA, MIDDLE DOT # →ᓈᐧ→ + +1604 ; 14D3 ; MA # ( ᘄ → ᓓ ) CANADIAN SYLLABICS CARRIER NE → CANADIAN SYLLABICS LE # + +14DD ; 14D3 00B7 ; MA # ( ᓝ → ᓓ· ) CANADIAN SYLLABICS 
WEST-CREE LWE → CANADIAN SYLLABICS LE, MIDDLE DOT # →ᓓᐧ→ + +14DF ; 14D5 00B7 ; MA # ( ᓟ → ᓕ· ) CANADIAN SYLLABICS WEST-CREE LWI → CANADIAN SYLLABICS LI, MIDDLE DOT # →ᓕᐧ→ + +14E1 ; 14D6 00B7 ; MA # ( ᓡ → ᓖ· ) CANADIAN SYLLABICS WEST-CREE LWII → CANADIAN SYLLABICS LII, MIDDLE DOT # →ᓖᐧ→ + +14E3 ; 14D7 00B7 ; MA # ( ᓣ → ᓗ· ) CANADIAN SYLLABICS WEST-CREE LWO → CANADIAN SYLLABICS LO, MIDDLE DOT # →ᓗᐧ→ + +14E5 ; 14D8 00B7 ; MA # ( ᓥ → ᓘ· ) CANADIAN SYLLABICS WEST-CREE LWOO → CANADIAN SYLLABICS LOO, MIDDLE DOT # →ᓘᐧ→ + +1607 ; 14DA ; MA # ( ᘇ → ᓚ ) CANADIAN SYLLABICS CARRIER NA → CANADIAN SYLLABICS LA # + +14E7 ; 14DA 00B7 ; MA # ( ᓧ → ᓚ· ) CANADIAN SYLLABICS WEST-CREE LWA → CANADIAN SYLLABICS LA, MIDDLE DOT # →ᓚᐧ→ + +14E9 ; 14DB 00B7 ; MA # ( ᓩ → ᓛ· ) CANADIAN SYLLABICS WEST-CREE LWAA → CANADIAN SYLLABICS LAA, MIDDLE DOT # →ᓛᐧ→ + +14F7 ; 14ED 00B7 ; MA # ( ᓷ → ᓭ· ) CANADIAN SYLLABICS WEST-CREE SWE → CANADIAN SYLLABICS SE, MIDDLE DOT # →ᓭᐧ→ + +14F9 ; 14EF 00B7 ; MA # ( ᓹ → ᓯ· ) CANADIAN SYLLABICS WEST-CREE SWI → CANADIAN SYLLABICS SI, MIDDLE DOT # →ᓯᐧ→ + +14FB ; 14F0 00B7 ; MA # ( ᓻ → ᓰ· ) CANADIAN SYLLABICS WEST-CREE SWII → CANADIAN SYLLABICS SII, MIDDLE DOT # →ᓰᐧ→ + +14FD ; 14F1 00B7 ; MA # ( ᓽ → ᓱ· ) CANADIAN SYLLABICS WEST-CREE SWO → CANADIAN SYLLABICS SO, MIDDLE DOT # →ᓱᐧ→ + +14FF ; 14F2 00B7 ; MA # ( ᓿ → ᓲ· ) CANADIAN SYLLABICS WEST-CREE SWOO → CANADIAN SYLLABICS SOO, MIDDLE DOT # →ᓲᐧ→ + +1501 ; 14F4 00B7 ; MA # ( ᔁ → ᓴ· ) CANADIAN SYLLABICS WEST-CREE SWA → CANADIAN SYLLABICS SA, MIDDLE DOT # →ᓴᐧ→ + +1503 ; 14F5 00B7 ; MA # ( ᔃ → ᓵ· ) CANADIAN SYLLABICS WEST-CREE SWAA → CANADIAN SYLLABICS SAA, MIDDLE DOT # →ᓵᐧ→ + +150C ; 150B 003C ; MA # ( ᔌ → ᔋ< ) CANADIAN SYLLABICS NASKAPI SPWA → CANADIAN SYLLABICS NASKAPI S-W, LESS-THAN SIGN # →ᔋᐸ→ + +150E ; 150B 0062 ; MA # ( ᔎ → ᔋb ) CANADIAN SYLLABICS NASKAPI SKWA → CANADIAN SYLLABICS NASKAPI S-W, LATIN SMALL LETTER B # →ᔋᑲ→ + +150D ; 150B 1455 ; MA # ( ᔍ → ᔋᑕ ) CANADIAN SYLLABICS NASKAPI STWA → CANADIAN SYLLABICS NASKAPI 
S-W, CANADIAN SYLLABICS TA # + +150F ; 150B 1490 ; MA # ( ᔏ → ᔋᒐ ) CANADIAN SYLLABICS NASKAPI SCWA → CANADIAN SYLLABICS NASKAPI S-W, CANADIAN SYLLABICS CA # + +1518 ; 1510 00B7 ; MA # ( ᔘ → ᔐ· ) CANADIAN SYLLABICS WEST-CREE SHWE → CANADIAN SYLLABICS SHE, MIDDLE DOT # →ᔐᐧ→ + +151A ; 1511 00B7 ; MA # ( ᔚ → ᔑ· ) CANADIAN SYLLABICS WEST-CREE SHWI → CANADIAN SYLLABICS SHI, MIDDLE DOT # →ᔑᐧ→ + +151C ; 1512 00B7 ; MA # ( ᔜ → ᔒ· ) CANADIAN SYLLABICS WEST-CREE SHWII → CANADIAN SYLLABICS SHII, MIDDLE DOT # →ᔒᐧ→ + +151E ; 1513 00B7 ; MA # ( ᔞ → ᔓ· ) CANADIAN SYLLABICS WEST-CREE SHWO → CANADIAN SYLLABICS SHO, MIDDLE DOT # →ᔓᐧ→ + +1520 ; 1514 00B7 ; MA # ( ᔠ → ᔔ· ) CANADIAN SYLLABICS WEST-CREE SHWOO → CANADIAN SYLLABICS SHOO, MIDDLE DOT # →ᔔᐧ→ + +1522 ; 1515 00B7 ; MA # ( ᔢ → ᔕ· ) CANADIAN SYLLABICS WEST-CREE SHWA → CANADIAN SYLLABICS SHA, MIDDLE DOT # →ᔕᐧ→ + +1524 ; 1516 00B7 ; MA # ( ᔤ → ᔖ· ) CANADIAN SYLLABICS WEST-CREE SHWAA → CANADIAN SYLLABICS SHAA, MIDDLE DOT # →ᔖᐧ→ + +1532 ; 1528 00B7 ; MA # ( ᔲ → ᔨ· ) CANADIAN SYLLABICS WEST-CREE YWI → CANADIAN SYLLABICS YI, MIDDLE DOT # →ᔨᐧ→ + +1534 ; 1529 00B7 ; MA # ( ᔴ → ᔩ· ) CANADIAN SYLLABICS WEST-CREE YWII → CANADIAN SYLLABICS YII, MIDDLE DOT # →ᔩᐧ→ + +1536 ; 152A 00B7 ; MA # ( ᔶ → ᔪ· ) CANADIAN SYLLABICS WEST-CREE YWO → CANADIAN SYLLABICS YO, MIDDLE DOT # →ᔪᐧ→ + +1538 ; 152B 00B7 ; MA # ( ᔸ → ᔫ· ) CANADIAN SYLLABICS WEST-CREE YWOO → CANADIAN SYLLABICS YOO, MIDDLE DOT # →ᔫᐧ→ + +153A ; 152D 00B7 ; MA # ( ᔺ → ᔭ· ) CANADIAN SYLLABICS WEST-CREE YWA → CANADIAN SYLLABICS YA, MIDDLE DOT # →ᔭᐧ→ + +153C ; 152E 00B7 ; MA # ( ᔼ → ᔮ· ) CANADIAN SYLLABICS WEST-CREE YWAA → CANADIAN SYLLABICS YAA, MIDDLE DOT # →ᔮᐧ→ + +1622 ; 1543 ; MA # ( ᘢ → ᕃ ) CANADIAN SYLLABICS CARRIER LU → CANADIAN SYLLABICS R-CREE RE # + +18E0 ; 1543 00B7 ; MA # ( ᣠ → ᕃ· ) CANADIAN SYLLABICS R-CREE RWE → CANADIAN SYLLABICS R-CREE RE, MIDDLE DOT # →ᕃᐧ→ + +1623 ; 1546 ; MA # ( ᘣ → ᕆ ) CANADIAN SYLLABICS CARRIER LO → CANADIAN SYLLABICS RI # + +1624 ; 154A ; MA # ( ᘤ → ᕊ ) 
CANADIAN SYLLABICS CARRIER LE → CANADIAN SYLLABICS WEST-CREE LO # + +154F ; 154C 00B7 ; MA # ( ᕏ → ᕌ· ) CANADIAN SYLLABICS WEST-CREE RWAA → CANADIAN SYLLABICS RAA, MIDDLE DOT # →ᕌᐧ→ + +1583 ; 1550 0062 ; MA # ( ᖃ → ᕐb ) CANADIAN SYLLABICS QA → CANADIAN SYLLABICS R, LATIN SMALL LETTER B # →ᕐᑲ→ + +1584 ; 1550 0062 0307 ; MA # ( ᖄ → ᕐḃ ) CANADIAN SYLLABICS QAA → CANADIAN SYLLABICS R, LATIN SMALL LETTER B, COMBINING DOT ABOVE # →ᕐᑳ→ + +1581 ; 1550 0064 ; MA # ( ᖁ → ᕐd ) CANADIAN SYLLABICS QO → CANADIAN SYLLABICS R, LATIN SMALL LETTER D # →ᕐᑯ→ + +157F ; 1550 0050 ; MA # ( ᕿ → ᕐP ) CANADIAN SYLLABICS QI → CANADIAN SYLLABICS R, LATIN CAPITAL LETTER P # →ᕐᑭ→ + +166F ; 1550 146B ; MA # ( ᙯ → ᕐᑫ ) CANADIAN SYLLABICS QAI → CANADIAN SYLLABICS R, CANADIAN SYLLABICS KE # + +157E ; 1550 146C ; MA # ( ᕾ → ᕐᑬ ) CANADIAN SYLLABICS QAAI → CANADIAN SYLLABICS R, CANADIAN SYLLABICS KAAI # + +1580 ; 1550 146E ; MA # ( ᖀ → ᕐᑮ ) CANADIAN SYLLABICS QII → CANADIAN SYLLABICS R, CANADIAN SYLLABICS KII # + +1582 ; 1550 1470 ; MA # ( ᖂ → ᕐᑰ ) CANADIAN SYLLABICS QOO → CANADIAN SYLLABICS R, CANADIAN SYLLABICS KOO # + +1585 ; 1550 1483 ; MA # ( ᖅ → ᕐᒃ ) CANADIAN SYLLABICS Q → CANADIAN SYLLABICS R, CANADIAN SYLLABICS K # + +155C ; 155A 00B7 ; MA # ( ᕜ → ᕚ· ) CANADIAN SYLLABICS WEST-CREE FWAA → CANADIAN SYLLABICS FAA, MIDDLE DOT # →ᕚᐧ→ + +18E3 ; 155E 00B7 ; MA # ( ᣣ → ᕞ· ) CANADIAN SYLLABICS THWE → CANADIAN SYLLABICS THE, MIDDLE DOT # →ᕞᐧ→ + +18E4 ; 1566 00B7 ; MA # ( ᣤ → ᕦ· ) CANADIAN SYLLABICS THWA → CANADIAN SYLLABICS THA, MIDDLE DOT # →ᕦᐧ→ + +1569 ; 1567 00B7 ; MA # ( ᕩ → ᕧ· ) CANADIAN SYLLABICS WEST-CREE THWAA → CANADIAN SYLLABICS THAA, MIDDLE DOT # →ᕧᐧ→ + +18E5 ; 156B 00B7 ; MA # ( ᣥ → ᕫ· ) CANADIAN SYLLABICS TTHWE → CANADIAN SYLLABICS TTHE, MIDDLE DOT # →ᕫᐧ→ + +18E8 ; 1586 00B7 ; MA # ( ᣨ → ᖆ· ) CANADIAN SYLLABICS TLHWE → CANADIAN SYLLABICS TLHE, MIDDLE DOT # →ᖆᐧ→ + +1591 ; 1595 004A ; MA # ( ᖑ → ᖕJ ) CANADIAN SYLLABICS NGO → CANADIAN SYLLABICS NG, LATIN CAPITAL LETTER J # →ᖕᒍ→ + +1670 ; 1595 
1489 ; MA # ( ᙰ → ᖕᒉ ) CANADIAN SYLLABICS NGAI → CANADIAN SYLLABICS NG, CANADIAN SYLLABICS CE # + +158E ; 1595 148A ; MA # ( ᖎ → ᖕᒊ ) CANADIAN SYLLABICS NGAAI → CANADIAN SYLLABICS NG, CANADIAN SYLLABICS CAAI # + +158F ; 1595 148B ; MA # ( ᖏ → ᖕᒋ ) CANADIAN SYLLABICS NGI → CANADIAN SYLLABICS NG, CANADIAN SYLLABICS CI # + +1590 ; 1595 148C ; MA # ( ᖐ → ᖕᒌ ) CANADIAN SYLLABICS NGII → CANADIAN SYLLABICS NG, CANADIAN SYLLABICS CII # + +1592 ; 1595 148E ; MA # ( ᖒ → ᖕᒎ ) CANADIAN SYLLABICS NGOO → CANADIAN SYLLABICS NG, CANADIAN SYLLABICS COO # + +1593 ; 1595 1490 ; MA # ( ᖓ → ᖕᒐ ) CANADIAN SYLLABICS NGA → CANADIAN SYLLABICS NG, CANADIAN SYLLABICS CA # + +1594 ; 1595 1491 ; MA # ( ᖔ → ᖕᒑ ) CANADIAN SYLLABICS NGAA → CANADIAN SYLLABICS NG, CANADIAN SYLLABICS CAA # + +1673 ; 1596 004A ; MA # ( ᙳ → ᖖJ ) CANADIAN SYLLABICS NNGO → CANADIAN SYLLABICS NNG, LATIN CAPITAL LETTER J # →ᖖᒍ→ + +1671 ; 1596 148B ; MA # ( ᙱ → ᖖᒋ ) CANADIAN SYLLABICS NNGI → CANADIAN SYLLABICS NNG, CANADIAN SYLLABICS CI # + +1672 ; 1596 148C ; MA # ( ᙲ → ᖖᒌ ) CANADIAN SYLLABICS NNGII → CANADIAN SYLLABICS NNG, CANADIAN SYLLABICS CII # + +1674 ; 1596 148E ; MA # ( ᙴ → ᖖᒎ ) CANADIAN SYLLABICS NNGOO → CANADIAN SYLLABICS NNG, CANADIAN SYLLABICS COO # + +1675 ; 1596 1490 ; MA # ( ᙵ → ᖖᒐ ) CANADIAN SYLLABICS NNGA → CANADIAN SYLLABICS NNG, CANADIAN SYLLABICS CA # + +1676 ; 1596 1491 ; MA # ( ᙶ → ᖖᒑ ) CANADIAN SYLLABICS NNGAA → CANADIAN SYLLABICS NNG, CANADIAN SYLLABICS CAA # + +18EA ; 1597 00B7 ; MA # ( ᣪ → ᖗ· ) CANADIAN SYLLABICS SAYISI SHWE → CANADIAN SYLLABICS SAYISI SHE, MIDDLE DOT # →ᖗᐧ→ + +1677 ; 15A7 00B7 ; MA # ( ᙷ → ᖧ· ) CANADIAN SYLLABICS WOODS-CREE THWEE → CANADIAN SYLLABICS TH-CREE THE, MIDDLE DOT # →ᖧᐧ→ + +1678 ; 15A8 00B7 ; MA # ( ᙸ → ᖨ· ) CANADIAN SYLLABICS WOODS-CREE THWI → CANADIAN SYLLABICS TH-CREE THI, MIDDLE DOT # →ᖨᐧ→ + +1679 ; 15A9 00B7 ; MA # ( ᙹ → ᖩ· ) CANADIAN SYLLABICS WOODS-CREE THWII → CANADIAN SYLLABICS TH-CREE THII, MIDDLE DOT # →ᖩᐧ→ + +167A ; 15AA 00B7 ; MA # ( ᙺ → ᖪ· ) CANADIAN 
SYLLABICS WOODS-CREE THWO → CANADIAN SYLLABICS TH-CREE THO, MIDDLE DOT # →ᖪᐧ→ + +167B ; 15AB 00B7 ; MA # ( ᙻ → ᖫ· ) CANADIAN SYLLABICS WOODS-CREE THWOO → CANADIAN SYLLABICS TH-CREE THOO, MIDDLE DOT # →ᖫᐧ→ + +167C ; 15AC 00B7 ; MA # ( ᙼ → ᖬ· ) CANADIAN SYLLABICS WOODS-CREE THWA → CANADIAN SYLLABICS TH-CREE THA, MIDDLE DOT # →ᖬᐧ→ + +167D ; 15AD 00B7 ; MA # ( ᙽ → ᖭ· ) CANADIAN SYLLABICS WOODS-CREE THWAA → CANADIAN SYLLABICS TH-CREE THAA, MIDDLE DOT # →ᖭᐧ→ + +2AAB ; 15D2 ; MA #* ( ⪫ → ᗒ ) LARGER THAN → CANADIAN SYLLABICS CARRIER WE # + +2AAA ; 15D5 ; MA #* ( ⪪ → ᗕ ) SMALLER THAN → CANADIAN SYLLABICS CARRIER WA # + +A4F7 ; 15E1 ; MA # ( ꓷ → ᗡ ) LISU LETTER OE → CANADIAN SYLLABICS CARRIER THA # + +18F0 ; 15F4 00B7 ; MA # ( ᣰ → ᗴ· ) CANADIAN SYLLABICS CARRIER GWA → CANADIAN SYLLABICS CARRIER GA, MIDDLE DOT # →ᗴᐧ→ + +18F2 ; 161B 00B7 ; MA # ( ᣲ → ᘛ· ) CANADIAN SYLLABICS CARRIER JWA → CANADIAN SYLLABICS CARRIER JA, MIDDLE DOT # →ᘛᐧ→ + +1DBB ; 1646 ; MA # ( ᶻ → ᙆ ) MODIFIER LETTER SMALL Z → CANADIAN SYLLABICS CARRIER Z # + +A4ED ; 1660 ; MA # ( ꓭ → ᙠ ) LISU LETTER GHA → CANADIAN SYLLABICS CARRIER TSA # + +1DBA ; 18D4 ; MA # ( ᶺ → ᣔ ) MODIFIER LETTER SMALL TURNED V → CANADIAN SYLLABICS OJIBWAY P # + +1D3E ; 18D6 ; MA # ( ᴾ → ᣖ ) MODIFIER LETTER CAPITAL P → CANADIAN SYLLABICS OJIBWAY K # + +18DC ; 18DF 141E ; MA # ( ᣜ → ᣟᐞ ) CANADIAN SYLLABICS EASTERN W → CANADIAN SYLLABICS FINAL RAISED DOT, CANADIAN SYLLABICS GLOTTAL STOP # + +02E1 ; 18F3 ; MA # ( ˡ → ᣳ ) MODIFIER LETTER SMALL L → CANADIAN SYLLABICS BEAVER DENE L # + +02B3 ; 18F4 ; MA # ( ʳ → ᣴ ) MODIFIER LETTER SMALL R → CANADIAN SYLLABICS BEAVER DENE R # + +02E2 ; 18F5 ; MA # ( ˢ → ᣵ ) MODIFIER LETTER SMALL S → CANADIAN SYLLABICS CARRIER DENTAL S # +18DB ; 18F5 ; MA # ( ᣛ → ᣵ ) CANADIAN SYLLABICS OJIBWAY SH → CANADIAN SYLLABICS CARRIER DENTAL S # →ˢ→ +A7F1 ; 18F5 ; MA # ( ꟱ → ᣵ ) MODIFIER LETTER CAPITAL S → CANADIAN SYLLABICS CARRIER DENTAL S # →ˢ→ + +A6B0 ; 16B9 ; MA # ( ꚰ → ᚹ ) BAMUM LETTER TAA → RUNIC LETTER WUNJO WYNN W 
# + +16E1 ; 16BC ; MA # ( ᛡ → ᚼ ) RUNIC LETTER IOR → RUNIC LETTER LONG-BRANCH-HAGALL H # + +237F ; 16BD ; MA #* ( ⍿ → ᚽ ) VERTICAL LINE WITH MIDDLE DOT → RUNIC LETTER SHORT-TWIG-HAGALL H # →ᛂ→ +16C2 ; 16BD ; MA # ( ᛂ → ᚽ ) RUNIC LETTER E → RUNIC LETTER SHORT-TWIG-HAGALL H # + +1D23F ; 16CB ; MA #* ( 𝈿 → ᛋ ) GREEK INSTRUMENTAL NOTATION SYMBOL-52 → RUNIC LETTER SIGEL LONG-BRANCH-SOL S # + +2191 ; 16CF ; MA #* ( ↑ → ᛏ ) UPWARDS ARROW → RUNIC LETTER TIWAZ TIR TYR T # + +21BF ; 16D0 ; MA #* ( ↿ → ᛐ ) UPWARDS HARPOON WITH BARB LEFTWARDS → RUNIC LETTER SHORT-TWIG-TYR T # + +296E ; 16D0 21C2 ; MA #* ( ⥮ → ᛐ⇂ ) UPWARDS HARPOON WITH BARB LEFT BESIDE DOWNWARDS HARPOON WITH BARB RIGHT → RUNIC LETTER SHORT-TWIG-TYR T, DOWNWARDS HARPOON WITH BARB RIGHTWARDS # →↿⇂→ + +2963 ; 16D0 16DA ; MA #* ( ⥣ → ᛐᛚ ) UPWARDS HARPOON WITH BARB LEFT BESIDE UPWARDS HARPOON WITH BARB RIGHT → RUNIC LETTER SHORT-TWIG-TYR T, RUNIC LETTER LAUKAZ LAGU LOGR L # →↿↾→ + +2D63 ; 16EF ; MA # ( ⵣ → ᛯ ) TIFINAGH LETTER YAZ → RUNIC TVIMADUR SYMBOL # + +21BE ; 16DA ; MA #* ( ↾ → ᛚ ) UPWARDS HARPOON WITH BARB RIGHTWARDS → RUNIC LETTER LAUKAZ LAGU LOGR L # +2A21 ; 16DA ; MA #* ( ⨡ → ᛚ ) Z NOTATION SCHEMA PROJECTION → RUNIC LETTER LAUKAZ LAGU LOGR L # →↾→ + +22C4 ; 16DC ; MA #* ( ⋄ → ᛜ ) DIAMOND OPERATOR → RUNIC LETTER INGWAZ # →◇→ +25C7 ; 16DC ; MA #* ( ◇ → ᛜ ) WHITE DIAMOND → RUNIC LETTER INGWAZ # +25CA ; 16DC ; MA #* ( ◊ → ᛜ ) LOZENGE → RUNIC LETTER INGWAZ # →⋄→→◇→ +2662 ; 16DC ; MA #* ( ♢ → ᛜ ) WHITE DIAMOND SUIT → RUNIC LETTER INGWAZ # →◊→→⋄→→◇→ +1F754 ; 16DC ; MA #* ( 🝔 → ᛜ ) ALCHEMICAL SYMBOL FOR SOAP → RUNIC LETTER INGWAZ # →◇→ +118B7 ; 16DC ; MA # ( 𑢷 → ᛜ ) WARANG CITI CAPITAL LETTER BU → RUNIC LETTER INGWAZ # →◇→ +10294 ; 16DC ; MA # ( 𐊔 → ᛜ ) LYCIAN LETTER KK → RUNIC LETTER INGWAZ # →◇→ + +235A ; 16DC 0332 ; MA #* ( ⍚ → ᛜ̲ ) APL FUNCTIONAL SYMBOL DIAMOND UNDERBAR → RUNIC LETTER INGWAZ, COMBINING LOW LINE # →◇̲→ + +22C8 ; 16DE ; MA #* ( ⋈ → ᛞ ) BOWTIE → RUNIC LETTER DAGAZ DAEG D # +2A1D ; 16DE ; MA #* ( 
⨝ → ᛞ ) JOIN → RUNIC LETTER DAGAZ DAEG D # →⋈→ + +104D0 ; 16E6 ; MA # ( 𐓐 → ᛦ ) OSAGE CAPITAL LETTER KHA → RUNIC LETTER LONG-BRANCH-YR # + +2195 ; 16E8 ; MA #* ( ↕ → ᛨ ) UP DOWN ARROW → RUNIC LETTER ICELANDIC-YR # + +10CFC ; 10C82 ; MA #* ( ‎𐳼‎ → ‎𐲂‎ ) OLD HUNGARIAN NUMBER TEN → OLD HUNGARIAN CAPITAL LETTER EB # + +10CFA ; 10CA5 ; MA #* ( ‎𐳺‎ → ‎𐲥‎ ) OLD HUNGARIAN NUMBER ONE → OLD HUNGARIAN CAPITAL LETTER ESZ # + +3131 ; 1100 ; MA # ( ㄱ → ᄀ ) HANGUL LETTER KIYEOK → HANGUL CHOSEONG KIYEOK # +11A8 ; 1100 ; MA # ( ᆨ → ᄀ ) HANGUL JONGSEONG KIYEOK → HANGUL CHOSEONG KIYEOK # + +1101 ; 1100 1100 ; MA # ( ᄁ → ᄀᄀ ) HANGUL CHOSEONG SSANGKIYEOK → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG KIYEOK # +3132 ; 1100 1100 ; MA # ( ㄲ → ᄀᄀ ) HANGUL LETTER SSANGKIYEOK → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG KIYEOK # →ᄁ→ +11A9 ; 1100 1100 ; MA # ( ᆩ → ᄀᄀ ) HANGUL JONGSEONG SSANGKIYEOK → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG KIYEOK # →ᄁ→ + +11FA ; 1100 1102 ; MA # ( ᇺ → ᄀᄂ ) HANGUL JONGSEONG KIYEOK-NIEUN → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG NIEUN # →ᆨᆫ→ + +115A ; 1100 1103 ; MA # ( ᅚ → ᄀᄃ ) HANGUL CHOSEONG KIYEOK-TIKEUT → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG TIKEUT # + +11C3 ; 1100 1105 ; MA # ( ᇃ → ᄀᄅ ) HANGUL JONGSEONG KIYEOK-RIEUL → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG RIEUL # →ᆨᆯ→ + +11FB ; 1100 1107 ; MA # ( ᇻ → ᄀᄇ ) HANGUL JONGSEONG KIYEOK-PIEUP → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG PIEUP # →ᆨᆸ→ + +11AA ; 1100 1109 ; MA # ( ᆪ → ᄀᄉ ) HANGUL JONGSEONG KIYEOK-SIOS → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG SIOS # →ᆨᆺ→ +3133 ; 1100 1109 ; MA # ( ㄳ → ᄀᄉ ) HANGUL LETTER KIYEOK-SIOS → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG SIOS # →ᆪ→→ᆨᆺ→ + +11C4 ; 1100 1109 1100 ; MA # ( ᇄ → ᄀᄉᄀ ) HANGUL JONGSEONG KIYEOK-SIOS-KIYEOK → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # →ᆨᆺᆨ→ + +11FC ; 1100 110E ; MA # ( ᇼ → ᄀᄎ ) HANGUL JONGSEONG KIYEOK-CHIEUCH → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG CHIEUCH # →ᆨᆾ→ + +11FD ; 1100 110F ; MA # ( ᇽ → ᄀᄏ ) HANGUL JONGSEONG 
KIYEOK-KHIEUKH → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG KHIEUKH # →ᆨᆿ→ + +11FE ; 1100 1112 ; MA # ( ᇾ → ᄀᄒ ) HANGUL JONGSEONG KIYEOK-HIEUH → HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG HIEUH # →ᆨᇂ→ + +3134 ; 1102 ; MA # ( ㄴ → ᄂ ) HANGUL LETTER NIEUN → HANGUL CHOSEONG NIEUN # +11AB ; 1102 ; MA # ( ᆫ → ᄂ ) HANGUL JONGSEONG NIEUN → HANGUL CHOSEONG NIEUN # + +1113 ; 1102 1100 ; MA # ( ᄓ → ᄂᄀ ) HANGUL CHOSEONG NIEUN-KIYEOK → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG KIYEOK # +11C5 ; 1102 1100 ; MA # ( ᇅ → ᄂᄀ ) HANGUL JONGSEONG NIEUN-KIYEOK → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG KIYEOK # →ᄓ→ + +1114 ; 1102 1102 ; MA # ( ᄔ → ᄂᄂ ) HANGUL CHOSEONG SSANGNIEUN → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG NIEUN # +3165 ; 1102 1102 ; MA # ( ㅥ → ᄂᄂ ) HANGUL LETTER SSANGNIEUN → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG NIEUN # →ᄔ→ +11FF ; 1102 1102 ; MA # ( ᇿ → ᄂᄂ ) HANGUL JONGSEONG SSANGNIEUN → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG NIEUN # →ᆫᆫ→ + +1115 ; 1102 1103 ; MA # ( ᄕ → ᄂᄃ ) HANGUL CHOSEONG NIEUN-TIKEUT → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG TIKEUT # +3166 ; 1102 1103 ; MA # ( ㅦ → ᄂᄃ ) HANGUL LETTER NIEUN-TIKEUT → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG TIKEUT # →ᄕ→ +11C6 ; 1102 1103 ; MA # ( ᇆ → ᄂᄃ ) HANGUL JONGSEONG NIEUN-TIKEUT → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG TIKEUT # →ᄕ→ + +D7CB ; 1102 1105 ; MA # ( ퟋ → ᄂᄅ ) HANGUL JONGSEONG NIEUN-RIEUL → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG RIEUL # →ᆫᆯ→ + +1116 ; 1102 1107 ; MA # ( ᄖ → ᄂᄇ ) HANGUL CHOSEONG NIEUN-PIEUP → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG PIEUP # + +115B ; 1102 1109 ; MA # ( ᅛ → ᄂᄉ ) HANGUL CHOSEONG NIEUN-SIOS → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG SIOS # +11C7 ; 1102 1109 ; MA # ( ᇇ → ᄂᄉ ) HANGUL JONGSEONG NIEUN-SIOS → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG SIOS # →ᆫᆺ→ +3167 ; 1102 1109 ; MA # ( ㅧ → ᄂᄉ ) HANGUL LETTER NIEUN-SIOS → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG SIOS # →ᇇ→→ᆫᆺ→ + +115C ; 1102 110C ; MA # ( ᅜ → ᄂᄌ ) HANGUL CHOSEONG NIEUN-CIEUC → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG CIEUC # +11AC ; 1102 
110C ; MA # ( ᆬ → ᄂᄌ ) HANGUL JONGSEONG NIEUN-CIEUC → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG CIEUC # →ᆫᆽ→ +3135 ; 1102 110C ; MA # ( ㄵ → ᄂᄌ ) HANGUL LETTER NIEUN-CIEUC → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG CIEUC # →ᆬ→→ᆫᆽ→ + +D7CC ; 1102 110E ; MA # ( ퟌ → ᄂᄎ ) HANGUL JONGSEONG NIEUN-CHIEUCH → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG CHIEUCH # →ᆫᆾ→ + +11C9 ; 1102 1110 ; MA # ( ᇉ → ᄂᄐ ) HANGUL JONGSEONG NIEUN-THIEUTH → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG THIEUTH # →ᆫᇀ→ + +115D ; 1102 1112 ; MA # ( ᅝ → ᄂᄒ ) HANGUL CHOSEONG NIEUN-HIEUH → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG HIEUH # +11AD ; 1102 1112 ; MA # ( ᆭ → ᄂᄒ ) HANGUL JONGSEONG NIEUN-HIEUH → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG HIEUH # →ᆫᇂ→ +3136 ; 1102 1112 ; MA # ( ㄶ → ᄂᄒ ) HANGUL LETTER NIEUN-HIEUH → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG HIEUH # →ᆭ→→ᆫᇂ→ + +11C8 ; 1102 1140 ; MA # ( ᇈ → ᄂᅀ ) HANGUL JONGSEONG NIEUN-PANSIOS → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG PANSIOS # →ᆫᇫ→ +3168 ; 1102 1140 ; MA # ( ㅨ → ᄂᅀ ) HANGUL LETTER NIEUN-PANSIOS → HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG PANSIOS # →ᇈ→→ᆫᇫ→ + +723F ; 1102 116E 4E28 ; MA # ( 爿 → 누丨 ) CJK UNIFIED IDEOGRAPH-723F → HANGUL CHOSEONG NIEUN, HANGUL JUNGSEONG U, CJK UNIFIED IDEOGRAPH-4E28 # →뉘→→누ᅵ→ +2F59 ; 1102 116E 4E28 ; MA #* ( ⽙ → 누丨 ) KANGXI RADICAL HALF TREE TRUNK → HANGUL CHOSEONG NIEUN, HANGUL JUNGSEONG U, CJK UNIFIED IDEOGRAPH-4E28 # →爿→→뉘→→누ᅵ→ + +3137 ; 1103 ; MA # ( ㄷ → ᄃ ) HANGUL LETTER TIKEUT → HANGUL CHOSEONG TIKEUT # +11AE ; 1103 ; MA # ( ᆮ → ᄃ ) HANGUL JONGSEONG TIKEUT → HANGUL CHOSEONG TIKEUT # + +1117 ; 1103 1100 ; MA # ( ᄗ → ᄃᄀ ) HANGUL CHOSEONG TIKEUT-KIYEOK → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG KIYEOK # +11CA ; 1103 1100 ; MA # ( ᇊ → ᄃᄀ ) HANGUL JONGSEONG TIKEUT-KIYEOK → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG KIYEOK # →ᄗ→ + +1104 ; 1103 1103 ; MA # ( ᄄ → ᄃᄃ ) HANGUL CHOSEONG SSANGTIKEUT → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG TIKEUT # +3138 ; 1103 1103 ; MA # ( ㄸ → ᄃᄃ ) HANGUL LETTER SSANGTIKEUT → HANGUL CHOSEONG TIKEUT, HANGUL 
CHOSEONG TIKEUT # →ᄄ→ +D7CD ; 1103 1103 ; MA # ( ퟍ → ᄃᄃ ) HANGUL JONGSEONG SSANGTIKEUT → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG TIKEUT # →ᆮᆮ→ + +D7CE ; 1103 1103 1107 ; MA # ( ퟎ → ᄃᄃᄇ ) HANGUL JONGSEONG SSANGTIKEUT-PIEUP → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG PIEUP # →ᆮᆮᆸ→ + +115E ; 1103 1105 ; MA # ( ᅞ → ᄃᄅ ) HANGUL CHOSEONG TIKEUT-RIEUL → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG RIEUL # +11CB ; 1103 1105 ; MA # ( ᇋ → ᄃᄅ ) HANGUL JONGSEONG TIKEUT-RIEUL → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG RIEUL # →ᆮᆯ→ + +A960 ; 1103 1106 ; MA # ( ꥠ → ᄃᄆ ) HANGUL CHOSEONG TIKEUT-MIEUM → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG MIEUM # + +A961 ; 1103 1107 ; MA # ( ꥡ → ᄃᄇ ) HANGUL CHOSEONG TIKEUT-PIEUP → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG PIEUP # +D7CF ; 1103 1107 ; MA # ( ퟏ → ᄃᄇ ) HANGUL JONGSEONG TIKEUT-PIEUP → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG PIEUP # →ᆮᆸ→ + +A962 ; 1103 1109 ; MA # ( ꥢ → ᄃᄉ ) HANGUL CHOSEONG TIKEUT-SIOS → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG SIOS # +D7D0 ; 1103 1109 ; MA # ( ퟐ → ᄃᄉ ) HANGUL JONGSEONG TIKEUT-SIOS → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG SIOS # →ᆮᆺ→ + +D7D1 ; 1103 1109 1100 ; MA # ( ퟑ → ᄃᄉᄀ ) HANGUL JONGSEONG TIKEUT-SIOS-KIYEOK → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # →ᆮᆺᆨ→ + +A963 ; 1103 110C ; MA # ( ꥣ → ᄃᄌ ) HANGUL CHOSEONG TIKEUT-CIEUC → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG CIEUC # +D7D2 ; 1103 110C ; MA # ( ퟒ → ᄃᄌ ) HANGUL JONGSEONG TIKEUT-CIEUC → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG CIEUC # →ᆮᆽ→ + +D7D3 ; 1103 110E ; MA # ( ퟓ → ᄃᄎ ) HANGUL JONGSEONG TIKEUT-CHIEUCH → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG CHIEUCH # →ᆮᆾ→ + +D7D4 ; 1103 1110 ; MA # ( ퟔ → ᄃᄐ ) HANGUL JONGSEONG TIKEUT-THIEUTH → HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG THIEUTH # →ᆮᇀ→ + +3139 ; 1105 ; MA # ( ㄹ → ᄅ ) HANGUL LETTER RIEUL → HANGUL CHOSEONG RIEUL # +11AF ; 1105 ; MA # ( ᆯ → ᄅ ) HANGUL JONGSEONG RIEUL → HANGUL CHOSEONG RIEUL # + +A964 ; 1105 1100 ; MA # ( ꥤ → ᄅᄀ ) HANGUL CHOSEONG 
RIEUL-KIYEOK → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK # +11B0 ; 1105 1100 ; MA # ( ᆰ → ᄅᄀ ) HANGUL JONGSEONG RIEUL-KIYEOK → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK # →ᆯᆨ→ +313A ; 1105 1100 ; MA # ( ㄺ → ᄅᄀ ) HANGUL LETTER RIEUL-KIYEOK → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK # →ᆰ→→ᆯᆨ→ + +A965 ; 1105 1100 1100 ; MA # ( ꥥ → ᄅᄀᄀ ) HANGUL CHOSEONG RIEUL-SSANGKIYEOK → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG KIYEOK # +D7D5 ; 1105 1100 1100 ; MA # ( ퟕ → ᄅᄀᄀ ) HANGUL JONGSEONG RIEUL-SSANGKIYEOK → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG KIYEOK # →ᆯᆨᆨ→ + +11CC ; 1105 1100 1109 ; MA # ( ᇌ → ᄅᄀᄉ ) HANGUL JONGSEONG RIEUL-KIYEOK-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG SIOS # →ᆯᆨᆺ→ +3169 ; 1105 1100 1109 ; MA # ( ㅩ → ᄅᄀᄉ ) HANGUL LETTER RIEUL-KIYEOK-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG SIOS # →ᇌ→→ᆯᆨᆺ→ + +D7D6 ; 1105 1100 1112 ; MA # ( ퟖ → ᄅᄀᄒ ) HANGUL JONGSEONG RIEUL-KIYEOK-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG HIEUH # →ᆯᆨᇂ→ + +1118 ; 1105 1102 ; MA # ( ᄘ → ᄅᄂ ) HANGUL CHOSEONG RIEUL-NIEUN → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG NIEUN # +11CD ; 1105 1102 ; MA # ( ᇍ → ᄅᄂ ) HANGUL JONGSEONG RIEUL-NIEUN → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG NIEUN # →ᄘ→ + +A966 ; 1105 1103 ; MA # ( ꥦ → ᄅᄃ ) HANGUL CHOSEONG RIEUL-TIKEUT → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG TIKEUT # +11CE ; 1105 1103 ; MA # ( ᇎ → ᄅᄃ ) HANGUL JONGSEONG RIEUL-TIKEUT → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG TIKEUT # →ᆯᆮ→ +316A ; 1105 1103 ; MA # ( ㅪ → ᄅᄃ ) HANGUL LETTER RIEUL-TIKEUT → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG TIKEUT # →ᇎ→→ᆯᆮ→ + +A967 ; 1105 1103 1103 ; MA # ( ꥧ → ᄅᄃᄃ ) HANGUL CHOSEONG RIEUL-SSANGTIKEUT → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG TIKEUT # + +11CF ; 1105 1103 1112 ; MA # ( ᇏ → ᄅᄃᄒ ) HANGUL JONGSEONG RIEUL-TIKEUT-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG TIKEUT, HANGUL CHOSEONG HIEUH # →ᆯᆮᇂ→ + +1119 ; 
1105 1105 ; MA # ( ᄙ → ᄅᄅ ) HANGUL CHOSEONG SSANGRIEUL → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG RIEUL # +11D0 ; 1105 1105 ; MA # ( ᇐ → ᄅᄅ ) HANGUL JONGSEONG SSANGRIEUL → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG RIEUL # →ᄙ→ + +D7D7 ; 1105 1105 110F ; MA # ( ퟗ → ᄅᄅᄏ ) HANGUL JONGSEONG SSANGRIEUL-KHIEUKH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KHIEUKH # →ᆯᆯᆿ→ + +A968 ; 1105 1106 ; MA # ( ꥨ → ᄅᄆ ) HANGUL CHOSEONG RIEUL-MIEUM → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG MIEUM # +11B1 ; 1105 1106 ; MA # ( ᆱ → ᄅᄆ ) HANGUL JONGSEONG RIEUL-MIEUM → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG MIEUM # →ᆯᆷ→ +313B ; 1105 1106 ; MA # ( ㄻ → ᄅᄆ ) HANGUL LETTER RIEUL-MIEUM → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG MIEUM # →ᆱ→→ᆯᆷ→ + +11D1 ; 1105 1106 1100 ; MA # ( ᇑ → ᄅᄆᄀ ) HANGUL JONGSEONG RIEUL-MIEUM-KIYEOK → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG KIYEOK # →ᆯᆷᆨ→ + +11D2 ; 1105 1106 1109 ; MA # ( ᇒ → ᄅᄆᄉ ) HANGUL JONGSEONG RIEUL-MIEUM-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG SIOS # →ᆯᆷᆺ→ + +D7D8 ; 1105 1106 1112 ; MA # ( ퟘ → ᄅᄆᄒ ) HANGUL JONGSEONG RIEUL-MIEUM-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG HIEUH # →ᆯᆷᇂ→ + +A969 ; 1105 1107 ; MA # ( ꥩ → ᄅᄇ ) HANGUL CHOSEONG RIEUL-PIEUP → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP # +11B2 ; 1105 1107 ; MA # ( ᆲ → ᄅᄇ ) HANGUL JONGSEONG RIEUL-PIEUP → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP # →ᆯᆸ→ +313C ; 1105 1107 ; MA # ( ㄼ → ᄅᄇ ) HANGUL LETTER RIEUL-PIEUP → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP # →ᆲ→→ᆯᆸ→ + +D7D9 ; 1105 1107 1103 ; MA # ( ퟙ → ᄅᄇᄃ ) HANGUL JONGSEONG RIEUL-PIEUP-TIKEUT → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG TIKEUT # →ᆯᆸᆮ→ + +A96A ; 1105 1107 1107 ; MA # ( ꥪ → ᄅᄇᄇ ) HANGUL CHOSEONG RIEUL-SSANGPIEUP → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PIEUP # + +11D3 ; 1105 1107 1109 ; MA # ( ᇓ → ᄅᄇᄉ ) HANGUL JONGSEONG RIEUL-PIEUP-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL 
CHOSEONG SIOS # →ᆯᆸᆺ→ +316B ; 1105 1107 1109 ; MA # ( ㅫ → ᄅᄇᄉ ) HANGUL LETTER RIEUL-PIEUP-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS # →ᇓ→→ᆯᆸᆺ→ + +A96B ; 1105 1107 110B ; MA # ( ꥫ → ᄅᄇᄋ ) HANGUL CHOSEONG RIEUL-KAPYEOUNPIEUP → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # +11D5 ; 1105 1107 110B ; MA # ( ᇕ → ᄅᄇᄋ ) HANGUL JONGSEONG RIEUL-KAPYEOUNPIEUP → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # →ᆯᆸᆼ→ + +D7DA ; 1105 1107 1111 ; MA # ( ퟚ → ᄅᄇᄑ ) HANGUL JONGSEONG RIEUL-PIEUP-PHIEUPH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PHIEUPH # →ᆯᆸᇁ→ + +11D4 ; 1105 1107 1112 ; MA # ( ᇔ → ᄅᄇᄒ ) HANGUL JONGSEONG RIEUL-PIEUP-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG HIEUH # →ᆯᆸᇂ→ + +A96C ; 1105 1109 ; MA # ( ꥬ → ᄅᄉ ) HANGUL CHOSEONG RIEUL-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG SIOS # +11B3 ; 1105 1109 ; MA # ( ᆳ → ᄅᄉ ) HANGUL JONGSEONG RIEUL-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG SIOS # →ᆯᆺ→ +313D ; 1105 1109 ; MA # ( ㄽ → ᄅᄉ ) HANGUL LETTER RIEUL-SIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG SIOS # →ᆳ→→ᆯᆺ→ + +11D6 ; 1105 1109 1109 ; MA # ( ᇖ → ᄅᄉᄉ ) HANGUL JONGSEONG RIEUL-SSANGSIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS # →ᆯᆺᆺ→ + +111B ; 1105 110B ; MA # ( ᄛ → ᄅᄋ ) HANGUL CHOSEONG KAPYEOUNRIEUL → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG IEUNG # +D7DD ; 1105 110B ; MA # ( ퟝ → ᄅᄋ ) HANGUL JONGSEONG KAPYEOUNRIEUL → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG IEUNG # →ᆯᆼ→ + +A96D ; 1105 110C ; MA # ( ꥭ → ᄅᄌ ) HANGUL CHOSEONG RIEUL-CIEUC → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG CIEUC # + +A96E ; 1105 110F ; MA # ( ꥮ → ᄅᄏ ) HANGUL CHOSEONG RIEUL-KHIEUKH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KHIEUKH # +11D8 ; 1105 110F ; MA # ( ᇘ → ᄅᄏ ) HANGUL JONGSEONG RIEUL-KHIEUKH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG KHIEUKH # →ᆯᆿ→ + +11B4 ; 1105 1110 ; MA # ( ᆴ → ᄅᄐ ) HANGUL JONGSEONG RIEUL-THIEUTH → HANGUL CHOSEONG RIEUL, HANGUL 
CHOSEONG THIEUTH # →ᆯᇀ→ +313E ; 1105 1110 ; MA # ( ㄾ → ᄅᄐ ) HANGUL LETTER RIEUL-THIEUTH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG THIEUTH # →ᆴ→→ᆯᇀ→ + +11B5 ; 1105 1111 ; MA # ( ᆵ → ᄅᄑ ) HANGUL JONGSEONG RIEUL-PHIEUPH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PHIEUPH # →ᆯᇁ→ +313F ; 1105 1111 ; MA # ( ㄿ → ᄅᄑ ) HANGUL LETTER RIEUL-PHIEUPH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PHIEUPH # →ᆵ→→ᆯᇁ→ + +111A ; 1105 1112 ; MA # ( ᄚ → ᄅᄒ ) HANGUL CHOSEONG RIEUL-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG HIEUH # +3140 ; 1105 1112 ; MA # ( ㅀ → ᄅᄒ ) HANGUL LETTER RIEUL-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG HIEUH # →ᄚ→ +113B ; 1105 1112 ; MA # ( ᄻ → ᄅᄒ ) HANGUL CHOSEONG SIOS-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG HIEUH # →ᄚ→ +11B6 ; 1105 1112 ; MA # ( ᆶ → ᄅᄒ ) HANGUL JONGSEONG RIEUL-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG HIEUH # →ᄚ→ +D7F2 ; 1105 1112 ; MA # ( ퟲ → ᄅᄒ ) HANGUL JONGSEONG SIOS-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG HIEUH # →ᆺᇂ→→ᄉᄒ→→ᄻ→→ᄚ→ + +11D7 ; 1105 1140 ; MA # ( ᇗ → ᄅᅀ ) HANGUL JONGSEONG RIEUL-PANSIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PANSIOS # →ᆯᇫ→ +316C ; 1105 1140 ; MA # ( ㅬ → ᄅᅀ ) HANGUL LETTER RIEUL-PANSIOS → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PANSIOS # →ᇗ→→ᆯᇫ→ + +D7DB ; 1105 114C ; MA # ( ퟛ → ᄅᅌ ) HANGUL JONGSEONG RIEUL-YESIEUNG → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG YESIEUNG # →ᆯᇰ→ + +11D9 ; 1105 1159 ; MA # ( ᇙ → ᄅᅙ ) HANGUL JONGSEONG RIEUL-YEORINHIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG YEORINHIEUH # →ᆯᇹ→ +316D ; 1105 1159 ; MA # ( ㅭ → ᄅᅙ ) HANGUL LETTER RIEUL-YEORINHIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG YEORINHIEUH # →ᇙ→→ᆯᇹ→ + +D7DC ; 1105 1159 1112 ; MA # ( ퟜ → ᄅᅙᄒ ) HANGUL JONGSEONG RIEUL-YEORINHIEUH-HIEUH → HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG YEORINHIEUH, HANGUL CHOSEONG HIEUH # →ᆯᇹᇂ→ + +3141 ; 1106 ; MA # ( ㅁ → ᄆ ) HANGUL LETTER MIEUM → HANGUL CHOSEONG MIEUM # +11B7 ; 1106 ; MA # ( ᆷ → ᄆ ) HANGUL JONGSEONG MIEUM → HANGUL CHOSEONG MIEUM # + +A96F ; 1106 1100 ; MA # ( ꥯ → ᄆᄀ ) HANGUL 
CHOSEONG MIEUM-KIYEOK → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG KIYEOK # +11DA ; 1106 1100 ; MA # ( ᇚ → ᄆᄀ ) HANGUL JONGSEONG MIEUM-KIYEOK → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG KIYEOK # →ᆷᆨ→ + +D7DE ; 1106 1102 ; MA # ( ퟞ → ᄆᄂ ) HANGUL JONGSEONG MIEUM-NIEUN → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG NIEUN # →ᆷᆫ→ + +D7DF ; 1106 1102 1102 ; MA # ( ퟟ → ᄆᄂᄂ ) HANGUL JONGSEONG MIEUM-SSANGNIEUN → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG NIEUN # →ᆷᆫᆫ→ + +A970 ; 1106 1103 ; MA # ( ꥰ → ᄆᄃ ) HANGUL CHOSEONG MIEUM-TIKEUT → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG TIKEUT # + +11DB ; 1106 1105 ; MA # ( ᇛ → ᄆᄅ ) HANGUL JONGSEONG MIEUM-RIEUL → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG RIEUL # →ᆷᆯ→ + +D7E0 ; 1106 1106 ; MA # ( ퟠ → ᄆᄆ ) HANGUL JONGSEONG SSANGMIEUM → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG MIEUM # →ᆷᆷ→ + +111C ; 1106 1107 ; MA # ( ᄜ → ᄆᄇ ) HANGUL CHOSEONG MIEUM-PIEUP → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG PIEUP # +316E ; 1106 1107 ; MA # ( ㅮ → ᄆᄇ ) HANGUL LETTER MIEUM-PIEUP → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG PIEUP # →ᄜ→ +11DC ; 1106 1107 ; MA # ( ᇜ → ᄆᄇ ) HANGUL JONGSEONG MIEUM-PIEUP → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG PIEUP # →ᄜ→ + +D7E1 ; 1106 1107 1109 ; MA # ( ퟡ → ᄆᄇᄉ ) HANGUL JONGSEONG MIEUM-PIEUP-SIOS → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS # →ᆷᆸᆺ→ + +A971 ; 1106 1109 ; MA # ( ꥱ → ᄆᄉ ) HANGUL CHOSEONG MIEUM-SIOS → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG SIOS # +11DD ; 1106 1109 ; MA # ( ᇝ → ᄆᄉ ) HANGUL JONGSEONG MIEUM-SIOS → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG SIOS # →ᆷᆺ→ +316F ; 1106 1109 ; MA # ( ㅯ → ᄆᄉ ) HANGUL LETTER MIEUM-SIOS → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG SIOS # →ᇝ→→ᆷᆺ→ + +11DE ; 1106 1109 1109 ; MA # ( ᇞ → ᄆᄉᄉ ) HANGUL JONGSEONG MIEUM-SSANGSIOS → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS # →ᆷᆺᆺ→ + +111D ; 1106 110B ; MA # ( ᄝ → ᄆᄋ ) HANGUL CHOSEONG KAPYEOUNMIEUM → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG IEUNG # +3171 ; 1106 110B ; MA # ( ㅱ → ᄆᄋ ) HANGUL LETTER 
KAPYEOUNMIEUM → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG IEUNG # →ᄝ→ +11E2 ; 1106 110B ; MA # ( ᇢ → ᄆᄋ ) HANGUL JONGSEONG KAPYEOUNMIEUM → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG IEUNG # →ᄝ→ + +D7E2 ; 1106 110C ; MA # ( ퟢ → ᄆᄌ ) HANGUL JONGSEONG MIEUM-CIEUC → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG CIEUC # →ᆷᆽ→ + +11E0 ; 1106 110E ; MA # ( ᇠ → ᄆᄎ ) HANGUL JONGSEONG MIEUM-CHIEUCH → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG CHIEUCH # →ᆷᆾ→ + +11E1 ; 1106 1112 ; MA # ( ᇡ → ᄆᄒ ) HANGUL JONGSEONG MIEUM-HIEUH → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG HIEUH # →ᆷᇂ→ + +11DF ; 1106 1140 ; MA # ( ᇟ → ᄆᅀ ) HANGUL JONGSEONG MIEUM-PANSIOS → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG PANSIOS # →ᆷᇫ→ +3170 ; 1106 1140 ; MA # ( ㅰ → ᄆᅀ ) HANGUL LETTER MIEUM-PANSIOS → HANGUL CHOSEONG MIEUM, HANGUL CHOSEONG PANSIOS # →ᇟ→→ᆷᇫ→ + +535F ; 1106 1161 ; MA # ( 卟 → 마 ) CJK UNIFIED IDEOGRAPH-535F → HANGUL CHOSEONG MIEUM, HANGUL JUNGSEONG A # + +3142 ; 1107 ; MA # ( ㅂ → ᄇ ) HANGUL LETTER PIEUP → HANGUL CHOSEONG PIEUP # +11B8 ; 1107 ; MA # ( ᆸ → ᄇ ) HANGUL JONGSEONG PIEUP → HANGUL CHOSEONG PIEUP # + +111E ; 1107 1100 ; MA # ( ᄞ → ᄇᄀ ) HANGUL CHOSEONG PIEUP-KIYEOK → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG KIYEOK # +3172 ; 1107 1100 ; MA # ( ㅲ → ᄇᄀ ) HANGUL LETTER PIEUP-KIYEOK → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG KIYEOK # →ᄞ→ + +111F ; 1107 1102 ; MA # ( ᄟ → ᄇᄂ ) HANGUL CHOSEONG PIEUP-NIEUN → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG NIEUN # + +1120 ; 1107 1103 ; MA # ( ᄠ → ᄇᄃ ) HANGUL CHOSEONG PIEUP-TIKEUT → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG TIKEUT # +3173 ; 1107 1103 ; MA # ( ㅳ → ᄇᄃ ) HANGUL LETTER PIEUP-TIKEUT → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG TIKEUT # →ᄠ→ +D7E3 ; 1107 1103 ; MA # ( ퟣ → ᄇᄃ ) HANGUL JONGSEONG PIEUP-TIKEUT → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG TIKEUT # →ᆸᆮ→ + +11E3 ; 1107 1105 ; MA # ( ᇣ → ᄇᄅ ) HANGUL JONGSEONG PIEUP-RIEUL → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG RIEUL # →ᆸᆯ→ + +D7E4 ; 1107 1105 1111 ; MA # ( ퟤ → ᄇᄅᄑ ) HANGUL JONGSEONG PIEUP-RIEUL-PHIEUPH → HANGUL CHOSEONG PIEUP, 
HANGUL CHOSEONG RIEUL, HANGUL CHOSEONG PHIEUPH # →ᆸᆯᇁ→ + +D7E5 ; 1107 1106 ; MA # ( ퟥ → ᄇᄆ ) HANGUL JONGSEONG PIEUP-MIEUM → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG MIEUM # →ᆸᆷ→ + +1108 ; 1107 1107 ; MA # ( ᄈ → ᄇᄇ ) HANGUL CHOSEONG SSANGPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PIEUP # +3143 ; 1107 1107 ; MA # ( ㅃ → ᄇᄇ ) HANGUL LETTER SSANGPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PIEUP # →ᄈ→ +D7E6 ; 1107 1107 ; MA # ( ퟦ → ᄇᄇ ) HANGUL JONGSEONG SSANGPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PIEUP # →ᆸᆸ→ + +112C ; 1107 1107 110B ; MA # ( ᄬ → ᄇᄇᄋ ) HANGUL CHOSEONG KAPYEOUNSSANGPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # +3179 ; 1107 1107 110B ; MA # ( ㅹ → ᄇᄇᄋ ) HANGUL LETTER KAPYEOUNSSANGPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # →ᄬ→ + +1121 ; 1107 1109 ; MA # ( ᄡ → ᄇᄉ ) HANGUL CHOSEONG PIEUP-SIOS → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS # +3144 ; 1107 1109 ; MA # ( ㅄ → ᄇᄉ ) HANGUL LETTER PIEUP-SIOS → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS # →ᄡ→ +11B9 ; 1107 1109 ; MA # ( ᆹ → ᄇᄉ ) HANGUL JONGSEONG PIEUP-SIOS → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS # →ᄡ→ + +1122 ; 1107 1109 1100 ; MA # ( ᄢ → ᄇᄉᄀ ) HANGUL CHOSEONG PIEUP-SIOS-KIYEOK → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # +3174 ; 1107 1109 1100 ; MA # ( ㅴ → ᄇᄉᄀ ) HANGUL LETTER PIEUP-SIOS-KIYEOK → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # →ᄢ→ + +1123 ; 1107 1109 1103 ; MA # ( ᄣ → ᄇᄉᄃ ) HANGUL CHOSEONG PIEUP-SIOS-TIKEUT → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG TIKEUT # +3175 ; 1107 1109 1103 ; MA # ( ㅵ → ᄇᄉᄃ ) HANGUL LETTER PIEUP-SIOS-TIKEUT → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG TIKEUT # →ᄣ→ +D7E7 ; 1107 1109 1103 ; MA # ( ퟧ → ᄇᄉᄃ ) HANGUL JONGSEONG PIEUP-SIOS-TIKEUT → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG TIKEUT # →ᆸᆺᆮ→ + +1124 ; 1107 1109 1107 ; MA # ( ᄤ → ᄇᄉᄇ ) HANGUL CHOSEONG PIEUP-SIOS-PIEUP → 
HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PIEUP # + +1125 ; 1107 1109 1109 ; MA # ( ᄥ → ᄇᄉᄉ ) HANGUL CHOSEONG PIEUP-SSANGSIOS → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS # + +1126 ; 1107 1109 110C ; MA # ( ᄦ → ᄇᄉᄌ ) HANGUL CHOSEONG PIEUP-SIOS-CIEUC → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG CIEUC # + +A972 ; 1107 1109 1110 ; MA # ( ꥲ → ᄇᄉᄐ ) HANGUL CHOSEONG PIEUP-SIOS-THIEUTH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG THIEUTH # + +112B ; 1107 110B ; MA # ( ᄫ → ᄇᄋ ) HANGUL CHOSEONG KAPYEOUNPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # +3178 ; 1107 110B ; MA # ( ㅸ → ᄇᄋ ) HANGUL LETTER KAPYEOUNPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # →ᄫ→ +11E6 ; 1107 110B ; MA # ( ᇦ → ᄇᄋ ) HANGUL JONGSEONG KAPYEOUNPIEUP → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # →ᄫ→ + +1127 ; 1107 110C ; MA # ( ᄧ → ᄇᄌ ) HANGUL CHOSEONG PIEUP-CIEUC → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG CIEUC # +3176 ; 1107 110C ; MA # ( ㅶ → ᄇᄌ ) HANGUL LETTER PIEUP-CIEUC → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG CIEUC # →ᄧ→ +D7E8 ; 1107 110C ; MA # ( ퟨ → ᄇᄌ ) HANGUL JONGSEONG PIEUP-CIEUC → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG CIEUC # →ᆸᆽ→ + +1128 ; 1107 110E ; MA # ( ᄨ → ᄇᄎ ) HANGUL CHOSEONG PIEUP-CHIEUCH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG CHIEUCH # +D7E9 ; 1107 110E ; MA # ( ퟩ → ᄇᄎ ) HANGUL JONGSEONG PIEUP-CHIEUCH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG CHIEUCH # →ᆸᆾ→ + +A973 ; 1107 110F ; MA # ( ꥳ → ᄇᄏ ) HANGUL CHOSEONG PIEUP-KHIEUKH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG KHIEUKH # + +1129 ; 1107 1110 ; MA # ( ᄩ → ᄇᄐ ) HANGUL CHOSEONG PIEUP-THIEUTH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG THIEUTH # +3177 ; 1107 1110 ; MA # ( ㅷ → ᄇᄐ ) HANGUL LETTER PIEUP-THIEUTH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG THIEUTH # →ᄩ→ + +112A ; 1107 1111 ; MA # ( ᄪ → ᄇᄑ ) HANGUL CHOSEONG PIEUP-PHIEUPH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PHIEUPH # +11E4 ; 1107 1111 ; MA # ( ᇤ → ᄇᄑ ) HANGUL JONGSEONG 
PIEUP-PHIEUPH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PHIEUPH # →ᄪ→ + +A974 ; 1107 1112 ; MA # ( ꥴ → ᄇᄒ ) HANGUL CHOSEONG PIEUP-HIEUH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG HIEUH # +11E5 ; 1107 1112 ; MA # ( ᇥ → ᄇᄒ ) HANGUL JONGSEONG PIEUP-HIEUH → HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG HIEUH # →ᆸᇂ→ + +3145 ; 1109 ; MA # ( ㅅ → ᄉ ) HANGUL LETTER SIOS → HANGUL CHOSEONG SIOS # +11BA ; 1109 ; MA # ( ᆺ → ᄉ ) HANGUL JONGSEONG SIOS → HANGUL CHOSEONG SIOS # + +4ECA ; 1109 30FC 1100 ; MA # ( 今 → ᄉーᄀ ) CJK UNIFIED IDEOGRAPH-4ECA → HANGUL CHOSEONG SIOS, KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL CHOSEONG KIYEOK # →슥→→스ᄀ→ + +5408 ; 1109 30FC 1106 ; MA # ( 合 → ᄉーᄆ ) CJK UNIFIED IDEOGRAPH-5408 → HANGUL CHOSEONG SIOS, KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL CHOSEONG MIEUM # →슴→→스ᄆ→ + +112D ; 1109 1100 ; MA # ( ᄭ → ᄉᄀ ) HANGUL CHOSEONG SIOS-KIYEOK → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # +317A ; 1109 1100 ; MA # ( ㅺ → ᄉᄀ ) HANGUL LETTER SIOS-KIYEOK → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # →ᄭ→ +11E7 ; 1109 1100 ; MA # ( ᇧ → ᄉᄀ ) HANGUL JONGSEONG SIOS-KIYEOK → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # →ᄭ→ + +112E ; 1109 1102 ; MA # ( ᄮ → ᄉᄂ ) HANGUL CHOSEONG SIOS-NIEUN → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG NIEUN # +317B ; 1109 1102 ; MA # ( ㅻ → ᄉᄂ ) HANGUL LETTER SIOS-NIEUN → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG NIEUN # →ᄮ→ + +112F ; 1109 1103 ; MA # ( ᄯ → ᄉᄃ ) HANGUL CHOSEONG SIOS-TIKEUT → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG TIKEUT # +317C ; 1109 1103 ; MA # ( ㅼ → ᄉᄃ ) HANGUL LETTER SIOS-TIKEUT → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG TIKEUT # →ᄯ→ +11E8 ; 1109 1103 ; MA # ( ᇨ → ᄉᄃ ) HANGUL JONGSEONG SIOS-TIKEUT → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG TIKEUT # →ᄯ→ + +1130 ; 1109 1105 ; MA # ( ᄰ → ᄉᄅ ) HANGUL CHOSEONG SIOS-RIEUL → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG RIEUL # +11E9 ; 1109 1105 ; MA # ( ᇩ → ᄉᄅ ) HANGUL JONGSEONG SIOS-RIEUL → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG RIEUL # →ᄰ→ + +1131 ; 1109 1106 ; MA # ( ᄱ → ᄉᄆ ) HANGUL CHOSEONG 
SIOS-MIEUM → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG MIEUM # +D7EA ; 1109 1106 ; MA # ( ퟪ → ᄉᄆ ) HANGUL JONGSEONG SIOS-MIEUM → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG MIEUM # →ᆺᆷ→ + +1132 ; 1109 1107 ; MA # ( ᄲ → ᄉᄇ ) HANGUL CHOSEONG SIOS-PIEUP → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PIEUP # +317D ; 1109 1107 ; MA # ( ㅽ → ᄉᄇ ) HANGUL LETTER SIOS-PIEUP → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PIEUP # →ᄲ→ +11EA ; 1109 1107 ; MA # ( ᇪ → ᄉᄇ ) HANGUL JONGSEONG SIOS-PIEUP → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PIEUP # →ᄲ→ + +1133 ; 1109 1107 1100 ; MA # ( ᄳ → ᄉᄇᄀ ) HANGUL CHOSEONG SIOS-PIEUP-KIYEOK → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG KIYEOK # + +D7EB ; 1109 1107 110B ; MA # ( ퟫ → ᄉᄇᄋ ) HANGUL JONGSEONG SIOS-KAPYEOUNPIEUP → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # →ᆺᆸᆼ→ + +110A ; 1109 1109 ; MA # ( ᄊ → ᄉᄉ ) HANGUL CHOSEONG SSANGSIOS → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS # +3146 ; 1109 1109 ; MA # ( ㅆ → ᄉᄉ ) HANGUL LETTER SSANGSIOS → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS # →ᄊ→ +11BB ; 1109 1109 ; MA # ( ᆻ → ᄉᄉ ) HANGUL JONGSEONG SSANGSIOS → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS # →ᄊ→ + +4E1B ; 1109 1109 30FC ; MA # ( 丛 → ᄉᄉー ) CJK UNIFIED IDEOGRAPH-4E1B → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS, KATAKANA-HIRAGANA PROLONGED SOUND MARK # →쓰→→ᄉ스→ + +D7EC ; 1109 1109 1100 ; MA # ( ퟬ → ᄉᄉᄀ ) HANGUL JONGSEONG SSANGSIOS-KIYEOK → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KIYEOK # →ᆺᆺᆨ→ + +D7ED ; 1109 1109 1103 ; MA # ( ퟭ → ᄉᄉᄃ ) HANGUL JONGSEONG SSANGSIOS-TIKEUT → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG TIKEUT # →ᆺᆺᆮ→ + +A975 ; 1109 1109 1107 ; MA # ( ꥵ → ᄉᄉᄇ ) HANGUL CHOSEONG SSANGSIOS-PIEUP → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PIEUP # + +1134 ; 1109 1109 1109 ; MA # ( ᄴ → ᄉᄉᄉ ) HANGUL CHOSEONG SIOS-SSANGSIOS → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS, HANGUL CHOSEONG SIOS # + +1135 ; 1109 110B ; MA # ( ᄵ → ᄉᄋ ) HANGUL CHOSEONG SIOS-IEUNG → 
HANGUL CHOSEONG SIOS, HANGUL CHOSEONG IEUNG # + +1136 ; 1109 110C ; MA # ( ᄶ → ᄉᄌ ) HANGUL CHOSEONG SIOS-CIEUC → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG CIEUC # +317E ; 1109 110C ; MA # ( ㅾ → ᄉᄌ ) HANGUL LETTER SIOS-CIEUC → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG CIEUC # →ᄶ→ +D7EF ; 1109 110C ; MA # ( ퟯ → ᄉᄌ ) HANGUL JONGSEONG SIOS-CIEUC → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG CIEUC # →ᆺᆽ→ + +1137 ; 1109 110E ; MA # ( ᄷ → ᄉᄎ ) HANGUL CHOSEONG SIOS-CHIEUCH → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG CHIEUCH # +D7F0 ; 1109 110E ; MA # ( ퟰ → ᄉᄎ ) HANGUL JONGSEONG SIOS-CHIEUCH → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG CHIEUCH # →ᆺᆾ→ + +1138 ; 1109 110F ; MA # ( ᄸ → ᄉᄏ ) HANGUL CHOSEONG SIOS-KHIEUKH → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG KHIEUKH # + +1139 ; 1109 1110 ; MA # ( ᄹ → ᄉᄐ ) HANGUL CHOSEONG SIOS-THIEUTH → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG THIEUTH # +D7F1 ; 1109 1110 ; MA # ( ퟱ → ᄉᄐ ) HANGUL JONGSEONG SIOS-THIEUTH → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG THIEUTH # →ᆺᇀ→ + +113A ; 1109 1111 ; MA # ( ᄺ → ᄉᄑ ) HANGUL CHOSEONG SIOS-PHIEUPH → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PHIEUPH # + +D7EE ; 1109 1140 ; MA # ( ퟮ → ᄉᅀ ) HANGUL JONGSEONG SIOS-PANSIOS → HANGUL CHOSEONG SIOS, HANGUL CHOSEONG PANSIOS # →ᆺᇫ→ + +3147 ; 110B ; MA # ( ㅇ → ᄋ ) HANGUL LETTER IEUNG → HANGUL CHOSEONG IEUNG # +11BC ; 110B ; MA # ( ᆼ → ᄋ ) HANGUL JONGSEONG IEUNG → HANGUL CHOSEONG IEUNG # + +1141 ; 110B 1100 ; MA # ( ᅁ → ᄋᄀ ) HANGUL CHOSEONG IEUNG-KIYEOK → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG KIYEOK # +11EC ; 110B 1100 ; MA # ( ᇬ → ᄋᄀ ) HANGUL JONGSEONG IEUNG-KIYEOK → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG KIYEOK # →ᅁ→ + +11ED ; 110B 1100 1100 ; MA # ( ᇭ → ᄋᄀᄀ ) HANGUL JONGSEONG IEUNG-SSANGKIYEOK → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG KIYEOK, HANGUL CHOSEONG KIYEOK # →ᆼᆨᆨ→ + +1142 ; 110B 1103 ; MA # ( ᅂ → ᄋᄃ ) HANGUL CHOSEONG IEUNG-TIKEUT → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG TIKEUT # + +A976 ; 110B 1105 ; MA # ( ꥶ → ᄋᄅ ) HANGUL CHOSEONG IEUNG-RIEUL → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG 
RIEUL # + +1143 ; 110B 1106 ; MA # ( ᅃ → ᄋᄆ ) HANGUL CHOSEONG IEUNG-MIEUM → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG MIEUM # + +1144 ; 110B 1107 ; MA # ( ᅄ → ᄋᄇ ) HANGUL CHOSEONG IEUNG-PIEUP → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG PIEUP # + +1145 ; 110B 1109 ; MA # ( ᅅ → ᄋᄉ ) HANGUL CHOSEONG IEUNG-SIOS → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG SIOS # +11F1 ; 110B 1109 ; MA # ( ᇱ → ᄋᄉ ) HANGUL JONGSEONG YESIEUNG-SIOS → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG SIOS # →ᅅ→ +3182 ; 110B 1109 ; MA # ( ㆂ → ᄋᄉ ) HANGUL LETTER YESIEUNG-SIOS → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG SIOS # →ᇱ→→ᅅ→ + +1147 ; 110B 110B ; MA # ( ᅇ → ᄋᄋ ) HANGUL CHOSEONG SSANGIEUNG → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG IEUNG # +3180 ; 110B 110B ; MA # ( ㆀ → ᄋᄋ ) HANGUL LETTER SSANGIEUNG → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG IEUNG # →ᅇ→ +11EE ; 110B 110B ; MA # ( ᇮ → ᄋᄋ ) HANGUL JONGSEONG SSANGIEUNG → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG IEUNG # →ᅇ→ + +1148 ; 110B 110C ; MA # ( ᅈ → ᄋᄌ ) HANGUL CHOSEONG IEUNG-CIEUC → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG CIEUC # + +1149 ; 110B 110E ; MA # ( ᅉ → ᄋᄎ ) HANGUL CHOSEONG IEUNG-CHIEUCH → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG CHIEUCH # + +11EF ; 110B 110F ; MA # ( ᇯ → ᄋᄏ ) HANGUL JONGSEONG IEUNG-KHIEUKH → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG KHIEUKH # →ᆼᆿ→ + +114A ; 110B 1110 ; MA # ( ᅊ → ᄋᄐ ) HANGUL CHOSEONG IEUNG-THIEUTH → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG THIEUTH # + +114B ; 110B 1111 ; MA # ( ᅋ → ᄋᄑ ) HANGUL CHOSEONG IEUNG-PHIEUPH → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG PHIEUPH # + +A977 ; 110B 1112 ; MA # ( ꥷ → ᄋᄒ ) HANGUL CHOSEONG IEUNG-HIEUH → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG HIEUH # + +1146 ; 110B 1140 ; MA # ( ᅆ → ᄋᅀ ) HANGUL CHOSEONG IEUNG-PANSIOS → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG PANSIOS # +11F2 ; 110B 1140 ; MA # ( ᇲ → ᄋᅀ ) HANGUL JONGSEONG YESIEUNG-PANSIOS → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG PANSIOS # →ᅆ→ +3183 ; 110B 1140 ; MA # ( ㆃ → ᄋᅀ ) HANGUL LETTER YESIEUNG-PANSIOS → HANGUL CHOSEONG IEUNG, HANGUL CHOSEONG 
PANSIOS # →ᇲ→→ᅆ→ + +3148 ; 110C ; MA # ( ㅈ → ᄌ ) HANGUL LETTER CIEUC → HANGUL CHOSEONG CIEUC # +11BD ; 110C ; MA # ( ᆽ → ᄌ ) HANGUL JONGSEONG CIEUC → HANGUL CHOSEONG CIEUC # + +D7F7 ; 110C 1107 ; MA # ( ퟷ → ᄌᄇ ) HANGUL JONGSEONG CIEUC-PIEUP → HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG PIEUP # →ᆽᆸ→ + +D7F8 ; 110C 1107 1107 ; MA # ( ퟸ → ᄌᄇᄇ ) HANGUL JONGSEONG CIEUC-SSANGPIEUP → HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG PIEUP # →ᆽᆸᆸ→ + +114D ; 110C 110B ; MA # ( ᅍ → ᄌᄋ ) HANGUL CHOSEONG CIEUC-IEUNG → HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG IEUNG # + +110D ; 110C 110C ; MA # ( ᄍ → ᄌᄌ ) HANGUL CHOSEONG SSANGCIEUC → HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG CIEUC # +3149 ; 110C 110C ; MA # ( ㅉ → ᄌᄌ ) HANGUL LETTER SSANGCIEUC → HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG CIEUC # →ᄍ→ +D7F9 ; 110C 110C ; MA # ( ퟹ → ᄌᄌ ) HANGUL JONGSEONG SSANGCIEUC → HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG CIEUC # →ᆽᆽ→ + +A978 ; 110C 110C 1112 ; MA # ( ꥸ → ᄌᄌᄒ ) HANGUL CHOSEONG SSANGCIEUC-HIEUH → HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG CIEUC, HANGUL CHOSEONG HIEUH # + +4E15 ; 110C 1169 ; MA # ( 丕 → 조 ) CJK UNIFIED IDEOGRAPH-4E15 → HANGUL CHOSEONG CIEUC, HANGUL JUNGSEONG O # + +314A ; 110E ; MA # ( ㅊ → ᄎ ) HANGUL LETTER CHIEUCH → HANGUL CHOSEONG CHIEUCH # +11BE ; 110E ; MA # ( ᆾ → ᄎ ) HANGUL JONGSEONG CHIEUCH → HANGUL CHOSEONG CHIEUCH # + +1152 ; 110E 110F ; MA # ( ᅒ → ᄎᄏ ) HANGUL CHOSEONG CHIEUCH-KHIEUKH → HANGUL CHOSEONG CHIEUCH, HANGUL CHOSEONG KHIEUKH # + +1153 ; 110E 1112 ; MA # ( ᅓ → ᄎᄒ ) HANGUL CHOSEONG CHIEUCH-HIEUH → HANGUL CHOSEONG CHIEUCH, HANGUL CHOSEONG HIEUH # + +314B ; 110F ; MA # ( ㅋ → ᄏ ) HANGUL LETTER KHIEUKH → HANGUL CHOSEONG KHIEUKH # +11BF ; 110F ; MA # ( ᆿ → ᄏ ) HANGUL JONGSEONG KHIEUKH → HANGUL CHOSEONG KHIEUKH # + +314C ; 1110 ; MA # ( ㅌ → ᄐ ) HANGUL LETTER THIEUTH → HANGUL CHOSEONG THIEUTH # +11C0 ; 1110 ; MA # ( ᇀ → ᄐ ) HANGUL JONGSEONG THIEUTH → HANGUL CHOSEONG THIEUTH # + +9577 ; 1110 30FC 1102 110C ; MA # ( 長 → ᄐーᄂᄌ ) CJK UNIFIED IDEOGRAPH-9577 → HANGUL 
CHOSEONG THIEUTH, KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG CIEUC # →튽→→트ᄂᄌ→ +2ED1 ; 1110 30FC 1102 110C ; MA #* ( ⻑ → ᄐーᄂᄌ ) CJK RADICAL LONG ONE → HANGUL CHOSEONG THIEUTH, KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG CIEUC # →長→→튽→→트ᄂᄌ→ +2FA7 ; 1110 30FC 1102 110C ; MA #* ( ⾧ → ᄐーᄂᄌ ) KANGXI RADICAL LONG → HANGUL CHOSEONG THIEUTH, KATAKANA-HIRAGANA PROLONGED SOUND MARK, HANGUL CHOSEONG NIEUN, HANGUL CHOSEONG CIEUC # →長→→튽→→트ᄂᄌ→ + +A979 ; 1110 1110 ; MA # ( ꥹ → ᄐᄐ ) HANGUL CHOSEONG SSANGTHIEUTH → HANGUL CHOSEONG THIEUTH, HANGUL CHOSEONG THIEUTH # + +314D ; 1111 ; MA # ( ㅍ → ᄑ ) HANGUL LETTER PHIEUPH → HANGUL CHOSEONG PHIEUPH # +11C1 ; 1111 ; MA # ( ᇁ → ᄑ ) HANGUL JONGSEONG PHIEUPH → HANGUL CHOSEONG PHIEUPH # + +1156 ; 1111 1107 ; MA # ( ᅖ → ᄑᄇ ) HANGUL CHOSEONG PHIEUPH-PIEUP → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG PIEUP # +11F3 ; 1111 1107 ; MA # ( ᇳ → ᄑᄇ ) HANGUL JONGSEONG PHIEUPH-PIEUP → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG PIEUP # →ᅖ→ + +D7FA ; 1111 1109 ; MA # ( ퟺ → ᄑᄉ ) HANGUL JONGSEONG PHIEUPH-SIOS → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG SIOS # →ᇁᆺ→ + +1157 ; 1111 110B ; MA # ( ᅗ → ᄑᄋ ) HANGUL CHOSEONG KAPYEOUNPHIEUPH → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG IEUNG # +3184 ; 1111 110B ; MA # ( ㆄ → ᄑᄋ ) HANGUL LETTER KAPYEOUNPHIEUPH → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG IEUNG # →ᅗ→ +11F4 ; 1111 110B ; MA # ( ᇴ → ᄑᄋ ) HANGUL JONGSEONG KAPYEOUNPHIEUPH → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG IEUNG # →ᅗ→ + +D7FB ; 1111 1110 ; MA # ( ퟻ → ᄑᄐ ) HANGUL JONGSEONG PHIEUPH-THIEUTH → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG THIEUTH # →ᇁᇀ→ + +A97A ; 1111 1112 ; MA # ( ꥺ → ᄑᄒ ) HANGUL CHOSEONG PHIEUPH-HIEUH → HANGUL CHOSEONG PHIEUPH, HANGUL CHOSEONG HIEUH # + +314E ; 1112 ; MA # ( ㅎ → ᄒ ) HANGUL LETTER HIEUH → HANGUL CHOSEONG HIEUH # +11C2 ; 1112 ; MA # ( ᇂ → ᄒ ) HANGUL JONGSEONG HIEUH → HANGUL CHOSEONG HIEUH # + +11F5 ; 1112 1102 ; MA # ( ᇵ → ᄒᄂ ) HANGUL JONGSEONG HIEUH-NIEUN → 
HANGUL CHOSEONG HIEUH, HANGUL CHOSEONG NIEUN # →ᇂᆫ→ + +11F6 ; 1112 1105 ; MA # ( ᇶ → ᄒᄅ ) HANGUL JONGSEONG HIEUH-RIEUL → HANGUL CHOSEONG HIEUH, HANGUL CHOSEONG RIEUL # →ᇂᆯ→ + +11F7 ; 1112 1106 ; MA # ( ᇷ → ᄒᄆ ) HANGUL JONGSEONG HIEUH-MIEUM → HANGUL CHOSEONG HIEUH, HANGUL CHOSEONG MIEUM # →ᇂᆷ→ + +11F8 ; 1112 1107 ; MA # ( ᇸ → ᄒᄇ ) HANGUL JONGSEONG HIEUH-PIEUP → HANGUL CHOSEONG HIEUH, HANGUL CHOSEONG PIEUP # →ᇂᆸ→ + +A97B ; 1112 1109 ; MA # ( ꥻ → ᄒᄉ ) HANGUL CHOSEONG HIEUH-SIOS → HANGUL CHOSEONG HIEUH, HANGUL CHOSEONG SIOS # + +1158 ; 1112 1112 ; MA # ( ᅘ → ᄒᄒ ) HANGUL CHOSEONG SSANGHIEUH → HANGUL CHOSEONG HIEUH, HANGUL CHOSEONG HIEUH # +3185 ; 1112 1112 ; MA # ( ㆅ → ᄒᄒ ) HANGUL LETTER SSANGHIEUH → HANGUL CHOSEONG HIEUH, HANGUL CHOSEONG HIEUH # →ᅘ→ + +113D ; 113C 113C ; MA # ( ᄽ → ᄼᄼ ) HANGUL CHOSEONG CHITUEUMSSANGSIOS → HANGUL CHOSEONG CHITUEUMSIOS, HANGUL CHOSEONG CHITUEUMSIOS # + +113F ; 113E 113E ; MA # ( ᄿ → ᄾᄾ ) HANGUL CHOSEONG CEONGCHIEUMSSANGSIOS → HANGUL CHOSEONG CEONGCHIEUMSIOS, HANGUL CHOSEONG CEONGCHIEUMSIOS # + +317F ; 1140 ; MA # ( ㅿ → ᅀ ) HANGUL LETTER PANSIOS → HANGUL CHOSEONG PANSIOS # +11EB ; 1140 ; MA # ( ᇫ → ᅀ ) HANGUL JONGSEONG PANSIOS → HANGUL CHOSEONG PANSIOS # + +D7F3 ; 1140 1107 ; MA # ( ퟳ → ᅀᄇ ) HANGUL JONGSEONG PANSIOS-PIEUP → HANGUL CHOSEONG PANSIOS, HANGUL CHOSEONG PIEUP # →ᇫᆸ→ + +D7F4 ; 1140 1107 110B ; MA # ( ퟴ → ᅀᄇᄋ ) HANGUL JONGSEONG PANSIOS-KAPYEOUNPIEUP → HANGUL CHOSEONG PANSIOS, HANGUL CHOSEONG PIEUP, HANGUL CHOSEONG IEUNG # →ᇫᆸᆼ→ + +3181 ; 114C ; MA # ( ㆁ → ᅌ ) HANGUL LETTER YESIEUNG → HANGUL CHOSEONG YESIEUNG # +11F0 ; 114C ; MA # ( ᇰ → ᅌ ) HANGUL JONGSEONG YESIEUNG → HANGUL CHOSEONG YESIEUNG # + +D7F5 ; 114C 1106 ; MA # ( ퟵ → ᅌᄆ ) HANGUL JONGSEONG YESIEUNG-MIEUM → HANGUL CHOSEONG YESIEUNG, HANGUL CHOSEONG MIEUM # →ᇰᆷ→ + +D7F6 ; 114C 1112 ; MA # ( ퟶ → ᅌᄒ ) HANGUL JONGSEONG YESIEUNG-HIEUH → HANGUL CHOSEONG YESIEUNG, HANGUL CHOSEONG HIEUH # →ᇰᇂ→ + +114F ; 114E 114E ; MA # ( ᅏ → ᅎᅎ ) HANGUL CHOSEONG CHITUEUMSSANGCIEUC → HANGUL 
CHOSEONG CHITUEUMCIEUC, HANGUL CHOSEONG CHITUEUMCIEUC # + +1151 ; 1150 1150 ; MA # ( ᅑ → ᅐᅐ ) HANGUL CHOSEONG CEONGCHIEUMSSANGCIEUC → HANGUL CHOSEONG CEONGCHIEUMCIEUC, HANGUL CHOSEONG CEONGCHIEUMCIEUC # + +3186 ; 1159 ; MA # ( ㆆ → ᅙ ) HANGUL LETTER YEORINHIEUH → HANGUL CHOSEONG YEORINHIEUH # +11F9 ; 1159 ; MA # ( ᇹ → ᅙ ) HANGUL JONGSEONG YEORINHIEUH → HANGUL CHOSEONG YEORINHIEUH # + +A97C ; 1159 1159 ; MA # ( ꥼ → ᅙᅙ ) HANGUL CHOSEONG SSANGYEORINHIEUH → HANGUL CHOSEONG YEORINHIEUH, HANGUL CHOSEONG YEORINHIEUH # + +3164 ; 1160 ; MA # ( → ) HANGUL FILLER → HANGUL JUNGSEONG FILLER # + +314F ; 1161 ; MA # ( ㅏ → ᅡ ) HANGUL LETTER A → HANGUL JUNGSEONG A # + +11A3 ; 1161 30FC ; MA # ( ᆣ → ᅡー ) HANGUL JUNGSEONG A-EU → HANGUL JUNGSEONG A, KATAKANA-HIRAGANA PROLONGED SOUND MARK # →ᅡᅳ→ + +1176 ; 1161 1169 ; MA # ( ᅶ → ᅡᅩ ) HANGUL JUNGSEONG A-O → HANGUL JUNGSEONG A, HANGUL JUNGSEONG O # + +1177 ; 1161 116E ; MA # ( ᅷ → ᅡᅮ ) HANGUL JUNGSEONG A-U → HANGUL JUNGSEONG A, HANGUL JUNGSEONG U # + +1162 ; 1161 4E28 ; MA # ( ᅢ → ᅡ丨 ) HANGUL JUNGSEONG AE → HANGUL JUNGSEONG A, CJK UNIFIED IDEOGRAPH-4E28 # →ᅡᅵ→ +3150 ; 1161 4E28 ; MA # ( ㅐ → ᅡ丨 ) HANGUL LETTER AE → HANGUL JUNGSEONG A, CJK UNIFIED IDEOGRAPH-4E28 # →ᅢ→→ᅡᅵ→ + +3151 ; 1163 ; MA # ( ㅑ → ᅣ ) HANGUL LETTER YA → HANGUL JUNGSEONG YA # + +1178 ; 1163 1169 ; MA # ( ᅸ → ᅣᅩ ) HANGUL JUNGSEONG YA-O → HANGUL JUNGSEONG YA, HANGUL JUNGSEONG O # + +1179 ; 1163 116D ; MA # ( ᅹ → ᅣᅭ ) HANGUL JUNGSEONG YA-YO → HANGUL JUNGSEONG YA, HANGUL JUNGSEONG YO # + +11A4 ; 1163 116E ; MA # ( ᆤ → ᅣᅮ ) HANGUL JUNGSEONG YA-U → HANGUL JUNGSEONG YA, HANGUL JUNGSEONG U # + +1164 ; 1163 4E28 ; MA # ( ᅤ → ᅣ丨 ) HANGUL JUNGSEONG YAE → HANGUL JUNGSEONG YA, CJK UNIFIED IDEOGRAPH-4E28 # →ᅣᅵ→ +3152 ; 1163 4E28 ; MA # ( ㅒ → ᅣ丨 ) HANGUL LETTER YAE → HANGUL JUNGSEONG YA, CJK UNIFIED IDEOGRAPH-4E28 # →ᅤ→→ᅣᅵ→ + +3153 ; 1165 ; MA # ( ㅓ → ᅥ ) HANGUL LETTER EO → HANGUL JUNGSEONG EO # + +117C ; 1165 30FC ; MA # ( ᅼ → ᅥー ) HANGUL JUNGSEONG EO-EU → HANGUL JUNGSEONG EO, 
KATAKANA-HIRAGANA PROLONGED SOUND MARK # →ᅥᅳ→ + +117A ; 1165 1169 ; MA # ( ᅺ → ᅥᅩ ) HANGUL JUNGSEONG EO-O → HANGUL JUNGSEONG EO, HANGUL JUNGSEONG O # + +117B ; 1165 116E ; MA # ( ᅻ → ᅥᅮ ) HANGUL JUNGSEONG EO-U → HANGUL JUNGSEONG EO, HANGUL JUNGSEONG U # + +1166 ; 1165 4E28 ; MA # ( ᅦ → ᅥ丨 ) HANGUL JUNGSEONG E → HANGUL JUNGSEONG EO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅥᅵ→ +3154 ; 1165 4E28 ; MA # ( ㅔ → ᅥ丨 ) HANGUL LETTER E → HANGUL JUNGSEONG EO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅦ→→ᅥᅵ→ + +3155 ; 1167 ; MA # ( ㅕ → ᅧ ) HANGUL LETTER YEO → HANGUL JUNGSEONG YEO # + +11A5 ; 1167 1163 ; MA # ( ᆥ → ᅧᅣ ) HANGUL JUNGSEONG YEO-YA → HANGUL JUNGSEONG YEO, HANGUL JUNGSEONG YA # + +117D ; 1167 1169 ; MA # ( ᅽ → ᅧᅩ ) HANGUL JUNGSEONG YEO-O → HANGUL JUNGSEONG YEO, HANGUL JUNGSEONG O # + +117E ; 1167 116E ; MA # ( ᅾ → ᅧᅮ ) HANGUL JUNGSEONG YEO-U → HANGUL JUNGSEONG YEO, HANGUL JUNGSEONG U # + +1168 ; 1167 4E28 ; MA # ( ᅨ → ᅧ丨 ) HANGUL JUNGSEONG YE → HANGUL JUNGSEONG YEO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅧᅵ→ +3156 ; 1167 4E28 ; MA # ( ㅖ → ᅧ丨 ) HANGUL LETTER YE → HANGUL JUNGSEONG YEO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅨ→→ᅧᅵ→ + +3157 ; 1169 ; MA # ( ㅗ → ᅩ ) HANGUL LETTER O → HANGUL JUNGSEONG O # + +116A ; 1169 1161 ; MA # ( ᅪ → ᅩᅡ ) HANGUL JUNGSEONG WA → HANGUL JUNGSEONG O, HANGUL JUNGSEONG A # +3158 ; 1169 1161 ; MA # ( ㅘ → ᅩᅡ ) HANGUL LETTER WA → HANGUL JUNGSEONG O, HANGUL JUNGSEONG A # →ᅪ→ + +116B ; 1169 1161 4E28 ; MA # ( ᅫ → ᅩᅡ丨 ) HANGUL JUNGSEONG WAE → HANGUL JUNGSEONG O, HANGUL JUNGSEONG A, CJK UNIFIED IDEOGRAPH-4E28 # →ᅩᅡᅵ→ +3159 ; 1169 1161 4E28 ; MA # ( ㅙ → ᅩᅡ丨 ) HANGUL LETTER WAE → HANGUL JUNGSEONG O, HANGUL JUNGSEONG A, CJK UNIFIED IDEOGRAPH-4E28 # →ᅫ→→ᅩᅡᅵ→ + +11A6 ; 1169 1163 ; MA # ( ᆦ → ᅩᅣ ) HANGUL JUNGSEONG O-YA → HANGUL JUNGSEONG O, HANGUL JUNGSEONG YA # + +11A7 ; 1169 1163 4E28 ; MA # ( ᆧ → ᅩᅣ丨 ) HANGUL JUNGSEONG O-YAE → HANGUL JUNGSEONG O, HANGUL JUNGSEONG YA, CJK UNIFIED IDEOGRAPH-4E28 # →ᅩᅣᅵ→ + +117F ; 1169 1165 ; MA # ( ᅿ → ᅩᅥ ) HANGUL JUNGSEONG O-EO → HANGUL JUNGSEONG O, HANGUL 
JUNGSEONG EO # + +1180 ; 1169 1165 4E28 ; MA # ( ᆀ → ᅩᅥ丨 ) HANGUL JUNGSEONG O-E → HANGUL JUNGSEONG O, HANGUL JUNGSEONG EO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅩᅥᅵ→ + +D7B0 ; 1169 1167 ; MA # ( ힰ → ᅩᅧ ) HANGUL JUNGSEONG O-YEO → HANGUL JUNGSEONG O, HANGUL JUNGSEONG YEO # + +1181 ; 1169 1167 4E28 ; MA # ( ᆁ → ᅩᅧ丨 ) HANGUL JUNGSEONG O-YE → HANGUL JUNGSEONG O, HANGUL JUNGSEONG YEO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅩᅧᅵ→ + +1182 ; 1169 1169 ; MA # ( ᆂ → ᅩᅩ ) HANGUL JUNGSEONG O-O → HANGUL JUNGSEONG O, HANGUL JUNGSEONG O # + +D7B1 ; 1169 1169 4E28 ; MA # ( ힱ → ᅩᅩ丨 ) HANGUL JUNGSEONG O-O-I → HANGUL JUNGSEONG O, HANGUL JUNGSEONG O, CJK UNIFIED IDEOGRAPH-4E28 # →ᅩᅩᅵ→ + +1183 ; 1169 116E ; MA # ( ᆃ → ᅩᅮ ) HANGUL JUNGSEONG O-U → HANGUL JUNGSEONG O, HANGUL JUNGSEONG U # + +116C ; 1169 4E28 ; MA # ( ᅬ → ᅩ丨 ) HANGUL JUNGSEONG OE → HANGUL JUNGSEONG O, CJK UNIFIED IDEOGRAPH-4E28 # →ᅩᅵ→ +315A ; 1169 4E28 ; MA # ( ㅚ → ᅩ丨 ) HANGUL LETTER OE → HANGUL JUNGSEONG O, CJK UNIFIED IDEOGRAPH-4E28 # →ᅬ→→ᅩᅵ→ + +315B ; 116D ; MA # ( ㅛ → ᅭ ) HANGUL LETTER YO → HANGUL JUNGSEONG YO # + +D7B2 ; 116D 1161 ; MA # ( ힲ → ᅭᅡ ) HANGUL JUNGSEONG YO-A → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG A # + +D7B3 ; 116D 1161 4E28 ; MA # ( ힳ → ᅭᅡ丨 ) HANGUL JUNGSEONG YO-AE → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG A, CJK UNIFIED IDEOGRAPH-4E28 # →ᅭᅡᅵ→ + +1184 ; 116D 1163 ; MA # ( ᆄ → ᅭᅣ ) HANGUL JUNGSEONG YO-YA → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG YA # +3187 ; 116D 1163 ; MA # ( ㆇ → ᅭᅣ ) HANGUL LETTER YO-YA → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG YA # →ᆄ→ +1186 ; 116D 1163 ; MA # ( ᆆ → ᅭᅣ ) HANGUL JUNGSEONG YO-YEO → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG YA # →ᆄ→ + +1185 ; 116D 1163 4E28 ; MA # ( ᆅ → ᅭᅣ丨 ) HANGUL JUNGSEONG YO-YAE → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG YA, CJK UNIFIED IDEOGRAPH-4E28 # →ᅭᅣᅵ→ +3188 ; 116D 1163 4E28 ; MA # ( ㆈ → ᅭᅣ丨 ) HANGUL LETTER YO-YAE → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG YA, CJK UNIFIED IDEOGRAPH-4E28 # →ᆅ→→ᅭᅣᅵ→ + +D7B4 ; 116D 1165 ; MA # ( ힴ → ᅭᅥ ) HANGUL JUNGSEONG YO-EO → HANGUL JUNGSEONG 
YO, HANGUL JUNGSEONG EO # + +1187 ; 116D 1169 ; MA # ( ᆇ → ᅭᅩ ) HANGUL JUNGSEONG YO-O → HANGUL JUNGSEONG YO, HANGUL JUNGSEONG O # + +1188 ; 116D 4E28 ; MA # ( ᆈ → ᅭ丨 ) HANGUL JUNGSEONG YO-I → HANGUL JUNGSEONG YO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅭᅵ→ +3189 ; 116D 4E28 ; MA # ( ㆉ → ᅭ丨 ) HANGUL LETTER YO-I → HANGUL JUNGSEONG YO, CJK UNIFIED IDEOGRAPH-4E28 # →ᆈ→→ᅭᅵ→ + +315C ; 116E ; MA # ( ㅜ → ᅮ ) HANGUL LETTER U → HANGUL JUNGSEONG U # + +1189 ; 116E 1161 ; MA # ( ᆉ → ᅮᅡ ) HANGUL JUNGSEONG U-A → HANGUL JUNGSEONG U, HANGUL JUNGSEONG A # + +118A ; 116E 1161 4E28 ; MA # ( ᆊ → ᅮᅡ丨 ) HANGUL JUNGSEONG U-AE → HANGUL JUNGSEONG U, HANGUL JUNGSEONG A, CJK UNIFIED IDEOGRAPH-4E28 # →ᅮᅡᅵ→ + +116F ; 116E 1165 ; MA # ( ᅯ → ᅮᅥ ) HANGUL JUNGSEONG WEO → HANGUL JUNGSEONG U, HANGUL JUNGSEONG EO # +315D ; 116E 1165 ; MA # ( ㅝ → ᅮᅥ ) HANGUL LETTER WEO → HANGUL JUNGSEONG U, HANGUL JUNGSEONG EO # →ᅯ→ + +118B ; 116E 1165 30FC ; MA # ( ᆋ → ᅮᅥー ) HANGUL JUNGSEONG U-EO-EU → HANGUL JUNGSEONG U, HANGUL JUNGSEONG EO, KATAKANA-HIRAGANA PROLONGED SOUND MARK # →ᅮᅥᅳ→ + +1170 ; 116E 1165 4E28 ; MA # ( ᅰ → ᅮᅥ丨 ) HANGUL JUNGSEONG WE → HANGUL JUNGSEONG U, HANGUL JUNGSEONG EO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅮᅥᅵ→ +315E ; 116E 1165 4E28 ; MA # ( ㅞ → ᅮᅥ丨 ) HANGUL LETTER WE → HANGUL JUNGSEONG U, HANGUL JUNGSEONG EO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅰ→→ᅮᅥᅵ→ + +D7B5 ; 116E 1167 ; MA # ( ힵ → ᅮᅧ ) HANGUL JUNGSEONG U-YEO → HANGUL JUNGSEONG U, HANGUL JUNGSEONG YEO # + +118C ; 116E 1167 4E28 ; MA # ( ᆌ → ᅮᅧ丨 ) HANGUL JUNGSEONG U-YE → HANGUL JUNGSEONG U, HANGUL JUNGSEONG YEO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅮᅧᅵ→ + +118D ; 116E 116E ; MA # ( ᆍ → ᅮᅮ ) HANGUL JUNGSEONG U-U → HANGUL JUNGSEONG U, HANGUL JUNGSEONG U # + +1171 ; 116E 4E28 ; MA # ( ᅱ → ᅮ丨 ) HANGUL JUNGSEONG WI → HANGUL JUNGSEONG U, CJK UNIFIED IDEOGRAPH-4E28 # →ᅮᅵ→ +315F ; 116E 4E28 ; MA # ( ㅟ → ᅮ丨 ) HANGUL LETTER WI → HANGUL JUNGSEONG U, CJK UNIFIED IDEOGRAPH-4E28 # →ᅱ→→ᅮᅵ→ + +D7B6 ; 116E 4E28 4E28 ; MA # ( ힶ → ᅮ丨丨 ) HANGUL JUNGSEONG U-I-I → HANGUL JUNGSEONG U, CJK 
UNIFIED IDEOGRAPH-4E28, CJK UNIFIED IDEOGRAPH-4E28 # →ᅮᅵᅵ→ + +3160 ; 1172 ; MA # ( ㅠ → ᅲ ) HANGUL LETTER YU → HANGUL JUNGSEONG YU # + +118E ; 1172 1161 ; MA # ( ᆎ → ᅲᅡ ) HANGUL JUNGSEONG YU-A → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG A # + +D7B7 ; 1172 1161 4E28 ; MA # ( ힷ → ᅲᅡ丨 ) HANGUL JUNGSEONG YU-AE → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG A, CJK UNIFIED IDEOGRAPH-4E28 # →ᅲᅡᅵ→ + +118F ; 1172 1165 ; MA # ( ᆏ → ᅲᅥ ) HANGUL JUNGSEONG YU-EO → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG EO # + +1190 ; 1172 1165 4E28 ; MA # ( ᆐ → ᅲᅥ丨 ) HANGUL JUNGSEONG YU-E → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG EO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅲᅥᅵ→ + +1191 ; 1172 1167 ; MA # ( ᆑ → ᅲᅧ ) HANGUL JUNGSEONG YU-YEO → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG YEO # +318A ; 1172 1167 ; MA # ( ㆊ → ᅲᅧ ) HANGUL LETTER YU-YEO → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG YEO # →ᆑ→ + +1192 ; 1172 1167 4E28 ; MA # ( ᆒ → ᅲᅧ丨 ) HANGUL JUNGSEONG YU-YE → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG YEO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅲᅧᅵ→ +318B ; 1172 1167 4E28 ; MA # ( ㆋ → ᅲᅧ丨 ) HANGUL LETTER YU-YE → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG YEO, CJK UNIFIED IDEOGRAPH-4E28 # →ᆒ→→ᅲᅧᅵ→ + +D7B8 ; 1172 1169 ; MA # ( ힸ → ᅲᅩ ) HANGUL JUNGSEONG YU-O → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG O # + +1193 ; 1172 116E ; MA # ( ᆓ → ᅲᅮ ) HANGUL JUNGSEONG YU-U → HANGUL JUNGSEONG YU, HANGUL JUNGSEONG U # + +1194 ; 1172 4E28 ; MA # ( ᆔ → ᅲ丨 ) HANGUL JUNGSEONG YU-I → HANGUL JUNGSEONG YU, CJK UNIFIED IDEOGRAPH-4E28 # →ᅲᅵ→ +318C ; 1172 4E28 ; MA # ( ㆌ → ᅲ丨 ) HANGUL LETTER YU-I → HANGUL JUNGSEONG YU, CJK UNIFIED IDEOGRAPH-4E28 # →ᆔ→→ᅲᅵ→ + +318D ; 119E ; MA # ( ㆍ → ᆞ ) HANGUL LETTER ARAEA → HANGUL JUNGSEONG ARAEA # + +D7C5 ; 119E 1161 ; MA # ( ퟅ → ᆞᅡ ) HANGUL JUNGSEONG ARAEA-A → HANGUL JUNGSEONG ARAEA, HANGUL JUNGSEONG A # + +119F ; 119E 1165 ; MA # ( ᆟ → ᆞᅥ ) HANGUL JUNGSEONG ARAEA-EO → HANGUL JUNGSEONG ARAEA, HANGUL JUNGSEONG EO # + +D7C6 ; 119E 1165 4E28 ; MA # ( ퟆ → ᆞᅥ丨 ) HANGUL JUNGSEONG ARAEA-E → HANGUL JUNGSEONG ARAEA, HANGUL JUNGSEONG EO, CJK 
UNIFIED IDEOGRAPH-4E28 # →ᆞᅥᅵ→ + +11A0 ; 119E 116E ; MA # ( ᆠ → ᆞᅮ ) HANGUL JUNGSEONG ARAEA-U → HANGUL JUNGSEONG ARAEA, HANGUL JUNGSEONG U # + +11A2 ; 119E 119E ; MA # ( ᆢ → ᆞᆞ ) HANGUL JUNGSEONG SSANGARAEA → HANGUL JUNGSEONG ARAEA, HANGUL JUNGSEONG ARAEA # + +11A1 ; 119E 4E28 ; MA # ( ᆡ → ᆞ丨 ) HANGUL JUNGSEONG ARAEA-I → HANGUL JUNGSEONG ARAEA, CJK UNIFIED IDEOGRAPH-4E28 # →ᆞᅵ→ +318E ; 119E 4E28 ; MA # ( ㆎ → ᆞ丨 ) HANGUL LETTER ARAEAE → HANGUL JUNGSEONG ARAEA, CJK UNIFIED IDEOGRAPH-4E28 # →ᆡ→→ᆞᅵ→ + +30D8 ; 3078 ; MA # ( ヘ → へ ) KATAKANA LETTER HE → HIRAGANA LETTER HE # + +2341 ; 303C ; MA #* ( ⍁ → 〼 ) APL FUNCTIONAL SYMBOL QUAD SLASH → MASU MARK # →⧄→ +29C4 ; 303C ; MA #* ( ⧄ → 〼 ) SQUARED RISING DIAGONAL SLASH → MASU MARK # + +4E8E ; 1B122 ; MA # ( 于 → 𛄢 ) CJK UNIFIED IDEOGRAPH-4E8E → KATAKANA LETTER ARCHAIC WU # + +A49E ; A04A ; MA #* ( ꒞ → ꁊ ) YI RADICAL PUT → YI SYLLABLE PUT # + +A4AC ; A050 ; MA #* ( ꒬ → ꁐ ) YI RADICAL PYT → YI SYLLABLE PYT # + +A49C ; A0C0 ; MA #* ( ꒜ → ꃀ ) YI RADICAL MOP → YI SYLLABLE MOP # + +A4A8 ; A132 ; MA #* ( ꒨ → ꄲ ) YI RADICAL TU → YI SYLLABLE TU # + +A4BF ; A259 ; MA #* ( ꒿ → ꉙ ) YI RADICAL HXOP → YI SYLLABLE HXOP # + +A4BE ; A2B1 ; MA #* ( ꒾ → ꊱ ) YI RADICAL CIP → YI SYLLABLE CIP # + +A494 ; A2CD ; MA #* ( ꒔ → ꋍ ) YI RADICAL CYP → YI SYLLABLE CYP # + +A4C0 ; A3AB ; MA #* ( ꓀ → ꎫ ) YI RADICAL SHAT → YI SYLLABLE SHAT # + +A4C2 ; A3B5 ; MA #* ( ꓂ → ꎵ ) YI RADICAL SHOP → YI SYLLABLE SHOP # + +A4BA ; A3BF ; MA #* ( ꒺ → ꎿ ) YI RADICAL SHUR → YI SYLLABLE SHUR # + +A4B0 ; A3C2 ; MA #* ( ꒰ → ꏂ ) YI RADICAL SHY → YI SYLLABLE SHY # + +A4A7 ; A458 ; MA #* ( ꒧ → ꑘ ) YI RADICAL NYOP → YI SYLLABLE NYOP # + +22A5 ; A4D5 ; MA #* ( ⊥ → ꓕ ) UP TACK → LISU LETTER THA # +27C2 ; A4D5 ; MA #* ( ⟂ → ꓕ ) PERPENDICULAR → LISU LETTER THA # →⊥→ +1D21C ; A4D5 ; MA #* ( 𝈜 → ꓕ ) GREEK VOCAL NOTATION SYMBOL-54 → LISU LETTER THA # →Ʇ→ +A7B1 ; A4D5 ; MA # ( Ʇ → ꓕ ) LATIN CAPITAL LETTER TURNED T → LISU LETTER THA # + +A79E ; A4E4 ; MA # ( Ꞟ → ꓤ ) LATIN CAPITAL LETTER 
VOLAPUK UE → LISU LETTER ZA # + +2141 ; A4E8 ; MA #* ( ⅁ → ꓨ ) TURNED SANS-SERIF CAPITAL G → LISU LETTER HHA # + +2142 ; A4F6 ; MA #* ( ⅂ → ꓶ ) TURNED SANS-SERIF CAPITAL L → LISU LETTER UH # +1D215 ; A4F6 ; MA #* ( 𝈕 → ꓶ ) GREEK VOCAL NOTATION SYMBOL-22 → LISU LETTER UH # →⅂→ +1D22B ; A4F6 ; MA #* ( 𝈫 → ꓶ ) GREEK INSTRUMENTAL NOTATION SYMBOL-24 → LISU LETTER UH # →𝈕→→⅂→ +16F26 ; A4F6 ; MA # ( 𖼦 → ꓶ ) MIAO LETTER HA → LISU LETTER UH # →⅂→ +10411 ; A4F6 ; MA # ( 𐐑 → ꓶ ) DESERET CAPITAL LETTER PEE → LISU LETTER UH # →⅂→ + +2143 ; 16F00 ; MA #* ( ⅃ → 𖼀 ) REVERSED SANS-SERIF CAPITAL L → MIAO LETTER PA # + +11AE6 ; 11AE5 11AEF ; MA # ( 𑫦 → 𑫥𑫯 ) PAU CIN HAU RISING TONE → PAU CIN HAU RISING TONE LONG, PAU CIN HAU MID-LEVEL TONE # + +11AE8 ; 11AE5 11AE5 ; MA # ( 𑫨 → 𑫥𑫥 ) PAU CIN HAU RISING TONE LONG FINAL → PAU CIN HAU RISING TONE LONG, PAU CIN HAU RISING TONE LONG # + +11AE9 ; 11AE5 11AE5 11AEF ; MA # ( 𑫩 → 𑫥𑫥𑫯 ) PAU CIN HAU RISING TONE FINAL → PAU CIN HAU RISING TONE LONG, PAU CIN HAU RISING TONE LONG, PAU CIN HAU MID-LEVEL TONE # →𑫥𑫦→ + +11AEA ; 11AE5 11AE5 11AF0 ; MA # ( 𑫪 → 𑫥𑫥𑫰 ) PAU CIN HAU SANDHI GLOTTAL STOP FINAL → PAU CIN HAU RISING TONE LONG, PAU CIN HAU RISING TONE LONG, PAU CIN HAU GLOTTAL STOP VARIANT # →𑫥𑫧→ + +11AE7 ; 11AE5 11AF0 ; MA # ( 𑫧 → 𑫥𑫰 ) PAU CIN HAU SANDHI GLOTTAL STOP → PAU CIN HAU RISING TONE LONG, PAU CIN HAU GLOTTAL STOP VARIANT # + +11AF4 ; 11AF3 11AEF ; MA # ( 𑫴 → 𑫳𑫯 ) PAU CIN HAU LOW-FALLING TONE → PAU CIN HAU LOW-FALLING TONE LONG, PAU CIN HAU MID-LEVEL TONE # + +11AF6 ; 11AF3 11AF3 ; MA # ( 𑫶 → 𑫳𑫳 ) PAU CIN HAU LOW-FALLING TONE LONG FINAL → PAU CIN HAU LOW-FALLING TONE LONG, PAU CIN HAU LOW-FALLING TONE LONG # + +11AF7 ; 11AF3 11AF3 11AEF ; MA # ( 𑫷 → 𑫳𑫳𑫯 ) PAU CIN HAU LOW-FALLING TONE FINAL → PAU CIN HAU LOW-FALLING TONE LONG, PAU CIN HAU LOW-FALLING TONE LONG, PAU CIN HAU MID-LEVEL TONE # →𑫳𑫴→ + +11AF8 ; 11AF3 11AF3 11AF0 ; MA # ( 𑫸 → 𑫳𑫳𑫰 ) PAU CIN HAU GLOTTAL STOP FINAL → PAU CIN HAU LOW-FALLING TONE LONG, PAU CIN HAU LOW-FALLING TONE 
LONG, PAU CIN HAU GLOTTAL STOP VARIANT # →𑫳𑫵→ + +11AF5 ; 11AF3 11AF0 ; MA # ( 𑫵 → 𑫳𑫰 ) PAU CIN HAU GLOTTAL STOP → PAU CIN HAU LOW-FALLING TONE LONG, PAU CIN HAU GLOTTAL STOP VARIANT # + +11AEC ; 11AEB 11AEF ; MA # ( 𑫬 → 𑫫𑫯 ) PAU CIN HAU SANDHI TONE → PAU CIN HAU SANDHI TONE LONG, PAU CIN HAU MID-LEVEL TONE # + +11AED ; 11AEB 11AEB ; MA # ( 𑫭 → 𑫫𑫫 ) PAU CIN HAU SANDHI TONE LONG FINAL → PAU CIN HAU SANDHI TONE LONG, PAU CIN HAU SANDHI TONE LONG # + +11AEE ; 11AEB 11AEB 11AEF ; MA # ( 𑫮 → 𑫫𑫫𑫯 ) PAU CIN HAU SANDHI TONE FINAL → PAU CIN HAU SANDHI TONE LONG, PAU CIN HAU SANDHI TONE LONG, PAU CIN HAU MID-LEVEL TONE # →𑫫𑫬→ + +2295 ; 102A8 ; MA #* ( ⊕ → 𐊨 ) CIRCLED PLUS → CARIAN LETTER Q # +2A01 ; 102A8 ; MA #* ( ⨁ → 𐊨 ) N-ARY CIRCLED PLUS OPERATOR → CARIAN LETTER Q # →⊕→ +1F728 ; 102A8 ; MA #* ( 🜨 → 𐊨 ) ALCHEMICAL SYMBOL FOR VERDIGRIS → CARIAN LETTER Q # →⊕→ +A69A ; 102A8 ; MA # ( Ꚛ → 𐊨 ) CYRILLIC CAPITAL LETTER CROSSED O → CARIAN LETTER Q # →⊕→ + +25BD ; 102BC ; MA #* ( ▽ → 𐊼 ) WHITE DOWN-POINTING TRIANGLE → CARIAN LETTER K # +1D214 ; 102BC ; MA #* ( 𝈔 → 𐊼 ) GREEK VOCAL NOTATION SYMBOL-21 → CARIAN LETTER K # →▽→ +1F704 ; 102BC ; MA #* ( 🜄 → 𐊼 ) ALCHEMICAL SYMBOL FOR WATER → CARIAN LETTER K # →▽→ + +29D6 ; 102C0 ; MA #* ( ⧖ → 𐋀 ) WHITE HOURGLASS → CARIAN LETTER G # + +A79B ; 1043A ; MA # ( ꞛ → 𐐺 ) LATIN SMALL LETTER VOLAPUK AE → DESERET SMALL LETTER BEE # + +A79A ; 10412 ; MA # ( Ꞛ → 𐐒 ) LATIN CAPITAL LETTER VOLAPUK AE → DESERET CAPITAL LETTER BEE # + +104A0 ; 10486 ; MA # ( 𐒠 → 𐒆 ) OSMANYA DIGIT ZERO → OSMANYA LETTER DEEL # + +103D1 ; 10382 ; MA # ( 𐏑 → 𐎂 ) OLD PERSIAN NUMBER ONE → UGARITIC LETTER GAMLA # + +103D3 ; 10393 ; MA # ( 𐏓 → 𐎓 ) OLD PERSIAN NUMBER TEN → UGARITIC LETTER AIN # + +12038 ; 1039A ; MA # ( 𒀸 → 𐎚 ) CUNEIFORM SIGN ASH → UGARITIC LETTER TO # + +2625 ; 1099E ; MA #* ( ☥ → ‎𐦞‎ ) ANKH → MEROITIC HIEROGLYPHIC SYMBOL VIDJ # +132F9 ; 1099E ; MA # ( 𓋹 → ‎𐦞‎ ) EGYPTIAN HIEROGLYPH S034 → MEROITIC HIEROGLYPHIC SYMBOL VIDJ # →☥→ + +3039 ; 5344 ; MA # ( 〹 → 卄 ) 
HANGZHOU NUMERAL TWENTY → CJK UNIFIED IDEOGRAPH-5344 # + +F967 ; 4E0D ; MA # ( 不 → 不 ) CJK COMPATIBILITY IDEOGRAPH-F967 → CJK UNIFIED IDEOGRAPH-4E0D # + +2F800 ; 4E3D ; MA # ( 丽 → 丽 ) CJK COMPATIBILITY IDEOGRAPH-2F800 → CJK UNIFIED IDEOGRAPH-4E3D # + +FA70 ; 4E26 ; MA # ( 並 → 並 ) CJK COMPATIBILITY IDEOGRAPH-FA70 → CJK UNIFIED IDEOGRAPH-4E26 # + +239C ; 4E28 ; MA #* ( ⎜ → 丨 ) LEFT PARENTHESIS EXTENSION → CJK UNIFIED IDEOGRAPH-4E28 # →⎥→→⎮→ +239F ; 4E28 ; MA #* ( ⎟ → 丨 ) RIGHT PARENTHESIS EXTENSION → CJK UNIFIED IDEOGRAPH-4E28 # →⎥→→⎮→ +23A2 ; 4E28 ; MA #* ( ⎢ → 丨 ) LEFT SQUARE BRACKET EXTENSION → CJK UNIFIED IDEOGRAPH-4E28 # →⎥→→⎮→ +23A5 ; 4E28 ; MA #* ( ⎥ → 丨 ) RIGHT SQUARE BRACKET EXTENSION → CJK UNIFIED IDEOGRAPH-4E28 # →⎮→ +23AA ; 4E28 ; MA #* ( ⎪ → 丨 ) CURLY BRACKET EXTENSION → CJK UNIFIED IDEOGRAPH-4E28 # →⎥→→⎮→ +23AE ; 4E28 ; MA #* ( ⎮ → 丨 ) INTEGRAL EXTENSION → CJK UNIFIED IDEOGRAPH-4E28 # +31D1 ; 4E28 ; MA #* ( ㇑ → 丨 ) CJK STROKE S → CJK UNIFIED IDEOGRAPH-4E28 # +1175 ; 4E28 ; MA # ( ᅵ → 丨 ) HANGUL JUNGSEONG I → CJK UNIFIED IDEOGRAPH-4E28 # →ㅣ→ +3163 ; 4E28 ; MA # ( ㅣ → 丨 ) HANGUL LETTER I → CJK UNIFIED IDEOGRAPH-4E28 # +2F01 ; 4E28 ; MA #* ( ⼁ → 丨 ) KANGXI RADICAL LINE → CJK UNIFIED IDEOGRAPH-4E28 # + +119C ; 4E28 30FC ; MA # ( ᆜ → 丨ー ) HANGUL JUNGSEONG I-EU → CJK UNIFIED IDEOGRAPH-4E28, KATAKANA-HIRAGANA PROLONGED SOUND MARK # →ᅵᅳ→ + +1198 ; 4E28 1161 ; MA # ( ᆘ → 丨ᅡ ) HANGUL JUNGSEONG I-A → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG A # →ᅵᅡ→ + +1199 ; 4E28 1163 ; MA # ( ᆙ → 丨ᅣ ) HANGUL JUNGSEONG I-YA → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG YA # →ᅵᅣ→ + +D7BD ; 4E28 1163 1169 ; MA # ( ힽ → 丨ᅣᅩ ) HANGUL JUNGSEONG I-YA-O → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG YA, HANGUL JUNGSEONG O # →ᅵᅣᅩ→ + +D7BE ; 4E28 1163 4E28 ; MA # ( ힾ → 丨ᅣ丨 ) HANGUL JUNGSEONG I-YAE → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG YA, CJK UNIFIED IDEOGRAPH-4E28 # →ᅵᅣᅵ→ + +D7BF ; 4E28 1167 ; MA # ( ힿ → 丨ᅧ ) HANGUL JUNGSEONG I-YEO → CJK UNIFIED IDEOGRAPH-4E28, HANGUL 
JUNGSEONG YEO # →ᅵᅧ→ + +D7C0 ; 4E28 1167 4E28 ; MA # ( ퟀ → 丨ᅧ丨 ) HANGUL JUNGSEONG I-YE → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG YEO, CJK UNIFIED IDEOGRAPH-4E28 # →ᅵᅧᅵ→ + +119A ; 4E28 1169 ; MA # ( ᆚ → 丨ᅩ ) HANGUL JUNGSEONG I-O → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG O # →ᅵᅩ→ + +D7C1 ; 4E28 1169 4E28 ; MA # ( ퟁ → 丨ᅩ丨 ) HANGUL JUNGSEONG I-O-I → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG O, CJK UNIFIED IDEOGRAPH-4E28 # →ᅵᅩᅵ→ + +D7C2 ; 4E28 116D ; MA # ( ퟂ → 丨ᅭ ) HANGUL JUNGSEONG I-YO → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG YO # →ᅵᅭ→ + +119B ; 4E28 116E ; MA # ( ᆛ → 丨ᅮ ) HANGUL JUNGSEONG I-U → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG U # →ᅵᅮ→ + +D7C3 ; 4E28 1172 ; MA # ( ퟃ → 丨ᅲ ) HANGUL JUNGSEONG I-YU → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG YU # →ᅵᅲ→ + +119D ; 4E28 119E ; MA # ( ᆝ → 丨ᆞ ) HANGUL JUNGSEONG I-ARAEA → CJK UNIFIED IDEOGRAPH-4E28, HANGUL JUNGSEONG ARAEA # →ᅵᆞ→ + +D7C4 ; 4E28 4E28 ; MA # ( ퟄ → 丨丨 ) HANGUL JUNGSEONG I-I → CJK UNIFIED IDEOGRAPH-4E28, CJK UNIFIED IDEOGRAPH-4E28 # →ᅵᅵ→ + +F905 ; 4E32 ; MA # ( 串 → 串 ) CJK COMPATIBILITY IDEOGRAPH-F905 → CJK UNIFIED IDEOGRAPH-4E32 # + +2F801 ; 4E38 ; MA # ( 丸 → 丸 ) CJK COMPATIBILITY IDEOGRAPH-2F801 → CJK UNIFIED IDEOGRAPH-4E38 # + +F95E ; 4E39 ; MA # ( 丹 → 丹 ) CJK COMPATIBILITY IDEOGRAPH-F95E → CJK UNIFIED IDEOGRAPH-4E39 # + +2F802 ; 4E41 ; MA # ( 乁 → 乁 ) CJK COMPATIBILITY IDEOGRAPH-2F802 → CJK UNIFIED IDEOGRAPH-4E41 # + +31E0 ; 4E59 ; MA #* ( ㇠ → 乙 ) CJK STROKE HXWG → CJK UNIFIED IDEOGRAPH-4E59 # +2F04 ; 4E59 ; MA #* ( ⼄ → 乙 ) KANGXI RADICAL SECOND → CJK UNIFIED IDEOGRAPH-4E59 # + +31DF ; 4E5A ; MA #* ( ㇟ → 乚 ) CJK STROKE SWG → CJK UNIFIED IDEOGRAPH-4E5A # +2E83 ; 4E5A ; MA #* ( ⺃ → 乚 ) CJK RADICAL SECOND TWO → CJK UNIFIED IDEOGRAPH-4E5A # + +31D6 ; 4E5B ; MA #* ( ㇖ → 乛 ) CJK STROKE HG → CJK UNIFIED IDEOGRAPH-4E5B # +2E82 ; 4E5B ; MA #* ( ⺂ → 乛 ) CJK RADICAL SECOND ONE → CJK UNIFIED IDEOGRAPH-4E5B # →㇖→ + +2EF2 ; 4E80 ; MA #* ( ⻲ → 亀 ) CJK RADICAL J-SIMPLIFIED TURTLE → CJK UNIFIED 
IDEOGRAPH-4E80 # + +F91B ; 4E82 ; MA # ( 亂 → 亂 ) CJK COMPATIBILITY IDEOGRAPH-F91B → CJK UNIFIED IDEOGRAPH-4E82 # + +31DA ; 4E85 ; MA #* ( ㇚ → 亅 ) CJK STROKE SG → CJK UNIFIED IDEOGRAPH-4E85 # +2F05 ; 4E85 ; MA #* ( ⼅ → 亅 ) KANGXI RADICAL HOOK → CJK UNIFIED IDEOGRAPH-4E85 # + +F9BA ; 4E86 ; MA # ( 了 → 了 ) CJK COMPATIBILITY IDEOGRAPH-F9BA → CJK UNIFIED IDEOGRAPH-4E86 # + +30CB ; 4E8C ; MA # ( ニ → 二 ) KATAKANA LETTER NI → CJK UNIFIED IDEOGRAPH-4E8C # +2F06 ; 4E8C ; MA #* ( ⼆ → 二 ) KANGXI RADICAL TWO → CJK UNIFIED IDEOGRAPH-4E8C # + +2F803 ; 20122 ; MA # ( 𠄢 → 𠄢 ) CJK COMPATIBILITY IDEOGRAPH-2F803 → CJK UNIFIED IDEOGRAPH-20122 # + +2F07 ; 4EA0 ; MA #* ( ⼇ → 亠 ) KANGXI RADICAL LID → CJK UNIFIED IDEOGRAPH-4EA0 # + +F977 ; 4EAE ; MA # ( 亮 → 亮 ) CJK COMPATIBILITY IDEOGRAPH-F977 → CJK UNIFIED IDEOGRAPH-4EAE # + +2F08 ; 4EBA ; MA #* ( ⼈ → 人 ) KANGXI RADICAL MAN → CJK UNIFIED IDEOGRAPH-4EBA # + +30A4 ; 4EBB ; MA # ( イ → 亻 ) KATAKANA LETTER I → CJK UNIFIED IDEOGRAPH-4EBB # →⺅→ +2E85 ; 4EBB ; MA #* ( ⺅ → 亻 ) CJK RADICAL PERSON → CJK UNIFIED IDEOGRAPH-4EBB # + +F9FD ; 4EC0 ; MA # ( 什 → 什 ) CJK COMPATIBILITY IDEOGRAPH-F9FD → CJK UNIFIED IDEOGRAPH-4EC0 # + +2F819 ; 4ECC ; MA # ( 仌 → 仌 ) CJK COMPATIBILITY IDEOGRAPH-2F819 → CJK UNIFIED IDEOGRAPH-4ECC # + +F9A8 ; 4EE4 ; MA # ( 令 → 令 ) CJK COMPATIBILITY IDEOGRAPH-F9A8 → CJK UNIFIED IDEOGRAPH-4EE4 # + +2F804 ; 4F60 ; MA # ( 你 → 你 ) CJK COMPATIBILITY IDEOGRAPH-2F804 → CJK UNIFIED IDEOGRAPH-4F60 # + +5002 ; 4F75 ; MA # ( 倂 → 併 ) CJK UNIFIED IDEOGRAPH-5002 → CJK UNIFIED IDEOGRAPH-4F75 # +2F807 ; 4F75 ; MA # ( 倂 → 併 ) CJK COMPATIBILITY IDEOGRAPH-2F807 → CJK UNIFIED IDEOGRAPH-4F75 # →倂→ + +FA73 ; 4F80 ; MA # ( 侀 → 侀 ) CJK COMPATIBILITY IDEOGRAPH-FA73 → CJK UNIFIED IDEOGRAPH-4F80 # + +F92D ; 4F86 ; MA # ( 來 → 來 ) CJK COMPATIBILITY IDEOGRAPH-F92D → CJK UNIFIED IDEOGRAPH-4F86 # + +F9B5 ; 4F8B ; MA # ( 例 → 例 ) CJK COMPATIBILITY IDEOGRAPH-F9B5 → CJK UNIFIED IDEOGRAPH-4F8B # + +FA30 ; 4FAE ; MA # ( 侮 → 侮 ) CJK COMPATIBILITY IDEOGRAPH-FA30 → CJK 
UNIFIED IDEOGRAPH-4FAE # +2F805 ; 4FAE ; MA # ( 侮 → 侮 ) CJK COMPATIBILITY IDEOGRAPH-2F805 → CJK UNIFIED IDEOGRAPH-4FAE # + +2F806 ; 4FBB ; MA # ( 侻 → 侻 ) CJK COMPATIBILITY IDEOGRAPH-2F806 → CJK UNIFIED IDEOGRAPH-4FBB # + +F965 ; 4FBF ; MA # ( 便 → 便 ) CJK COMPATIBILITY IDEOGRAPH-F965 → CJK UNIFIED IDEOGRAPH-4FBF # + +503C ; 5024 ; MA # ( 值 → 値 ) CJK UNIFIED IDEOGRAPH-503C → CJK UNIFIED IDEOGRAPH-5024 # + +F9D4 ; 502B ; MA # ( 倫 → 倫 ) CJK COMPATIBILITY IDEOGRAPH-F9D4 → CJK UNIFIED IDEOGRAPH-502B # + +2F808 ; 507A ; MA # ( 偺 → 偺 ) CJK COMPATIBILITY IDEOGRAPH-2F808 → CJK UNIFIED IDEOGRAPH-507A # + +2F809 ; 5099 ; MA # ( 備 → 備 ) CJK COMPATIBILITY IDEOGRAPH-2F809 → CJK UNIFIED IDEOGRAPH-5099 # + +2F80B ; 50CF ; MA # ( 像 → 像 ) CJK COMPATIBILITY IDEOGRAPH-2F80B → CJK UNIFIED IDEOGRAPH-50CF # + +F9BB ; 50DA ; MA # ( 僚 → 僚 ) CJK COMPATIBILITY IDEOGRAPH-F9BB → CJK UNIFIED IDEOGRAPH-50DA # + +FA31 ; 50E7 ; MA # ( 僧 → 僧 ) CJK COMPATIBILITY IDEOGRAPH-FA31 → CJK UNIFIED IDEOGRAPH-50E7 # +2F80A ; 50E7 ; MA # ( 僧 → 僧 ) CJK COMPATIBILITY IDEOGRAPH-2F80A → CJK UNIFIED IDEOGRAPH-50E7 # + +2F80C ; 349E ; MA # ( 㒞 → 㒞 ) CJK COMPATIBILITY IDEOGRAPH-2F80C → CJK UNIFIED IDEOGRAPH-349E # + +3126 ; 513F ; MA # ( ㄦ → 儿 ) BOPOMOFO LETTER ER → CJK UNIFIED IDEOGRAPH-513F # +2F09 ; 513F ; MA #* ( ⼉ → 儿 ) KANGXI RADICAL LEGS → CJK UNIFIED IDEOGRAPH-513F # +16FF2 ; 513F ; MA # ( 𖿲 → 儿 ) CHINESE SMALL SIMPLIFIED ER → CJK UNIFIED IDEOGRAPH-513F # + +FA0C ; 5140 ; MA # ( 兀 → 兀 ) CJK COMPATIBILITY IDEOGRAPH-FA0C → CJK UNIFIED IDEOGRAPH-5140 # +2E8E ; 5140 ; MA #* ( ⺎ → 兀 ) CJK RADICAL LAME ONE → CJK UNIFIED IDEOGRAPH-5140 # + +FA74 ; 5145 ; MA # ( 充 → 充 ) CJK COMPATIBILITY IDEOGRAPH-FA74 → CJK UNIFIED IDEOGRAPH-5145 # + +FA32 ; 514D ; MA # ( 免 → 免 ) CJK COMPATIBILITY IDEOGRAPH-FA32 → CJK UNIFIED IDEOGRAPH-514D # +2F80E ; 514D ; MA # ( 免 → 免 ) CJK COMPATIBILITY IDEOGRAPH-2F80E → CJK UNIFIED IDEOGRAPH-514D # + +2F80F ; 5154 ; MA # ( 兔 → 兔 ) CJK COMPATIBILITY IDEOGRAPH-2F80F → CJK UNIFIED IDEOGRAPH-5154 # 
+ +2F810 ; 5164 ; MA # ( 兤 → 兤 ) CJK COMPATIBILITY IDEOGRAPH-2F810 → CJK UNIFIED IDEOGRAPH-5164 # + +2F0A ; 5165 ; MA #* ( ⼊ → 入 ) KANGXI RADICAL ENTER → CJK UNIFIED IDEOGRAPH-5165 # + +2F814 ; 5167 ; MA # ( 內 → 內 ) CJK COMPATIBILITY IDEOGRAPH-2F814 → CJK UNIFIED IDEOGRAPH-5167 # + +FA72 ; 5168 ; MA # ( 全 → 全 ) CJK COMPATIBILITY IDEOGRAPH-FA72 → CJK UNIFIED IDEOGRAPH-5168 # + +F978 ; 5169 ; MA # ( 兩 → 兩 ) CJK COMPATIBILITY IDEOGRAPH-F978 → CJK UNIFIED IDEOGRAPH-5169 # + +30CF ; 516B ; MA # ( ハ → 八 ) KATAKANA LETTER HA → CJK UNIFIED IDEOGRAPH-516B # +2F0B ; 516B ; MA #* ( ⼋ → 八 ) KANGXI RADICAL EIGHT → CJK UNIFIED IDEOGRAPH-516B # + +F9D1 ; 516D ; MA # ( 六 → 六 ) CJK COMPATIBILITY IDEOGRAPH-F9D1 → CJK UNIFIED IDEOGRAPH-516D # + +2F811 ; 5177 ; MA # ( 具 → 具 ) CJK COMPATIBILITY IDEOGRAPH-2F811 → CJK UNIFIED IDEOGRAPH-5177 # + +2F812 ; 2051C ; MA # ( 𠔜 → 𠔜 ) CJK COMPATIBILITY IDEOGRAPH-2F812 → CJK UNIFIED IDEOGRAPH-2051C # + +2F91B ; 20525 ; MA # ( 𠔥 → 𠔥 ) CJK COMPATIBILITY IDEOGRAPH-2F91B → CJK UNIFIED IDEOGRAPH-20525 # + +FA75 ; 5180 ; MA # ( 冀 → 冀 ) CJK COMPATIBILITY IDEOGRAPH-FA75 → CJK UNIFIED IDEOGRAPH-5180 # + +2F813 ; 34B9 ; MA # ( 㒹 → 㒹 ) CJK COMPATIBILITY IDEOGRAPH-2F813 → CJK UNIFIED IDEOGRAPH-34B9 # + +2F0C ; 5182 ; MA #* ( ⼌ → 冂 ) KANGXI RADICAL DOWN BOX → CJK UNIFIED IDEOGRAPH-5182 # + +2F815 ; 518D ; MA # ( 再 → 再 ) CJK COMPATIBILITY IDEOGRAPH-2F815 → CJK UNIFIED IDEOGRAPH-518D # + +2F816 ; 2054B ; MA # ( 𠕋 → 𠕋 ) CJK COMPATIBILITY IDEOGRAPH-2F816 → CJK UNIFIED IDEOGRAPH-2054B # + +2F8D2 ; 5192 ; MA # ( 冒 → 冒 ) CJK COMPATIBILITY IDEOGRAPH-2F8D2 → CJK UNIFIED IDEOGRAPH-5192 # + +2F8D3 ; 5195 ; MA # ( 冕 → 冕 ) CJK COMPATIBILITY IDEOGRAPH-2F8D3 → CJK UNIFIED IDEOGRAPH-5195 # + +2F9CA ; 34BB ; MA # ( 㒻 → 㒻 ) CJK COMPATIBILITY IDEOGRAPH-2F9CA → CJK UNIFIED IDEOGRAPH-34BB # + +2F8D4 ; 6700 ; MA # ( 最 → 最 ) CJK COMPATIBILITY IDEOGRAPH-2F8D4 → CJK UNIFIED IDEOGRAPH-6700 # + +2F0D ; 5196 ; MA #* ( ⼍ → 冖 ) KANGXI RADICAL COVER → CJK UNIFIED IDEOGRAPH-5196 # + +2F817 ; 
5197 ; MA # ( 冗 → 冗 ) CJK COMPATIBILITY IDEOGRAPH-2F817 → CJK UNIFIED IDEOGRAPH-5197 # + +2F818 ; 51A4 ; MA # ( 冤 → 冤 ) CJK COMPATIBILITY IDEOGRAPH-2F818 → CJK UNIFIED IDEOGRAPH-51A4 # + +2F0E ; 51AB ; MA #* ( ⼎ → 冫 ) KANGXI RADICAL ICE → CJK UNIFIED IDEOGRAPH-51AB # + +2F81A ; 51AC ; MA # ( 冬 → 冬 ) CJK COMPATIBILITY IDEOGRAPH-2F81A → CJK UNIFIED IDEOGRAPH-51AC # + +FA71 ; 51B5 ; MA # ( 况 → 况 ) CJK COMPATIBILITY IDEOGRAPH-FA71 → CJK UNIFIED IDEOGRAPH-51B5 # +2F81B ; 51B5 ; MA # ( 况 → 况 ) CJK COMPATIBILITY IDEOGRAPH-2F81B → CJK UNIFIED IDEOGRAPH-51B5 # + +F92E ; 51B7 ; MA # ( 冷 → 冷 ) CJK COMPATIBILITY IDEOGRAPH-F92E → CJK UNIFIED IDEOGRAPH-51B7 # + +F979 ; 51C9 ; MA # ( 凉 → 凉 ) CJK COMPATIBILITY IDEOGRAPH-F979 → CJK UNIFIED IDEOGRAPH-51C9 # + +F955 ; 51CC ; MA # ( 凌 → 凌 ) CJK COMPATIBILITY IDEOGRAPH-F955 → CJK UNIFIED IDEOGRAPH-51CC # + +F954 ; 51DC ; MA # ( 凜 → 凜 ) CJK COMPATIBILITY IDEOGRAPH-F954 → CJK UNIFIED IDEOGRAPH-51DC # + +FA15 ; 51DE ; MA # ( 凞 → 凞 ) CJK COMPATIBILITY IDEOGRAPH-FA15 → CJK UNIFIED IDEOGRAPH-51DE # + +2F0F ; 51E0 ; MA #* ( ⼏ → 几 ) KANGXI RADICAL TABLE → CJK UNIFIED IDEOGRAPH-51E0 # + +2F80D ; 2063A ; MA # ( 𠘺 → 𠘺 ) CJK COMPATIBILITY IDEOGRAPH-2F80D → CJK UNIFIED IDEOGRAPH-2063A # + +2F81D ; 51F5 ; MA # ( 凵 → 凵 ) CJK COMPATIBILITY IDEOGRAPH-2F81D → CJK UNIFIED IDEOGRAPH-51F5 # +2F10 ; 51F5 ; MA #* ( ⼐ → 凵 ) KANGXI RADICAL OPEN BOX → CJK UNIFIED IDEOGRAPH-51F5 # +20674 ; 51F5 ; MA # ( 𠙴 → 凵 ) CJK UNIFIED IDEOGRAPH-20674 → CJK UNIFIED IDEOGRAPH-51F5 # →凵→ + +2F11 ; 5200 ; MA #* ( ⼑ → 刀 ) KANGXI RADICAL KNIFE → CJK UNIFIED IDEOGRAPH-5200 # + +2E89 ; 5202 ; MA #* ( ⺉ → 刂 ) CJK RADICAL KNIFE TWO → CJK UNIFIED IDEOGRAPH-5202 # + +2F81E ; 5203 ; MA # ( 刃 → 刃 ) CJK COMPATIBILITY IDEOGRAPH-2F81E → CJK UNIFIED IDEOGRAPH-5203 # + +FA00 ; 5207 ; MA # ( 切 → 切 ) CJK COMPATIBILITY IDEOGRAPH-FA00 → CJK UNIFIED IDEOGRAPH-5207 # +2F850 ; 5207 ; MA # ( 切 → 切 ) CJK COMPATIBILITY IDEOGRAPH-2F850 → CJK UNIFIED IDEOGRAPH-5207 # + +F99C ; 5217 ; MA # ( 列 → 列 ) CJK 
COMPATIBILITY IDEOGRAPH-F99C → CJK UNIFIED IDEOGRAPH-5217 # + +F9DD ; 5229 ; MA # ( 利 → 利 ) CJK COMPATIBILITY IDEOGRAPH-F9DD → CJK UNIFIED IDEOGRAPH-5229 # + +2F81F ; 34DF ; MA # ( 㓟 → 㓟 ) CJK COMPATIBILITY IDEOGRAPH-2F81F → CJK UNIFIED IDEOGRAPH-34DF # + +F9FF ; 523A ; MA # ( 刺 → 刺 ) CJK COMPATIBILITY IDEOGRAPH-F9FF → CJK UNIFIED IDEOGRAPH-523A # + +2F820 ; 523B ; MA # ( 刻 → 刻 ) CJK COMPATIBILITY IDEOGRAPH-2F820 → CJK UNIFIED IDEOGRAPH-523B # + +2F821 ; 5246 ; MA # ( 剆 → 剆 ) CJK COMPATIBILITY IDEOGRAPH-2F821 → CJK UNIFIED IDEOGRAPH-5246 # + +2F822 ; 5272 ; MA # ( 割 → 割 ) CJK COMPATIBILITY IDEOGRAPH-2F822 → CJK UNIFIED IDEOGRAPH-5272 # + +2F823 ; 5277 ; MA # ( 剷 → 剷 ) CJK COMPATIBILITY IDEOGRAPH-2F823 → CJK UNIFIED IDEOGRAPH-5277 # + +F9C7 ; 5289 ; MA # ( 劉 → 劉 ) CJK COMPATIBILITY IDEOGRAPH-F9C7 → CJK UNIFIED IDEOGRAPH-5289 # + +2F9D9 ; 20804 ; MA # ( 𠠄 → 𠠄 ) CJK COMPATIBILITY IDEOGRAPH-2F9D9 → CJK UNIFIED IDEOGRAPH-20804 # + +30AB ; 529B ; MA # ( カ → 力 ) KATAKANA LETTER KA → CJK UNIFIED IDEOGRAPH-529B # →⼒→ +F98A ; 529B ; MA # ( 力 → 力 ) CJK COMPATIBILITY IDEOGRAPH-F98A → CJK UNIFIED IDEOGRAPH-529B # +2F12 ; 529B ; MA #* ( ⼒ → 力 ) KANGXI RADICAL POWER → CJK UNIFIED IDEOGRAPH-529B # + +F99D ; 52A3 ; MA # ( 劣 → 劣 ) CJK COMPATIBILITY IDEOGRAPH-F99D → CJK UNIFIED IDEOGRAPH-52A3 # + +2F824 ; 3515 ; MA # ( 㔕 → 㔕 ) CJK COMPATIBILITY IDEOGRAPH-2F824 → CJK UNIFIED IDEOGRAPH-3515 # + +2F992 ; 52B3 ; MA # ( 劳 → 劳 ) CJK COMPATIBILITY IDEOGRAPH-2F992 → CJK UNIFIED IDEOGRAPH-52B3 # + +FA76 ; 52C7 ; MA # ( 勇 → 勇 ) CJK COMPATIBILITY IDEOGRAPH-FA76 → CJK UNIFIED IDEOGRAPH-52C7 # +2F825 ; 52C7 ; MA # ( 勇 → 勇 ) CJK COMPATIBILITY IDEOGRAPH-2F825 → CJK UNIFIED IDEOGRAPH-52C7 # + +FA33 ; 52C9 ; MA # ( 勉 → 勉 ) CJK COMPATIBILITY IDEOGRAPH-FA33 → CJK UNIFIED IDEOGRAPH-52C9 # +2F826 ; 52C9 ; MA # ( 勉 → 勉 ) CJK COMPATIBILITY IDEOGRAPH-2F826 → CJK UNIFIED IDEOGRAPH-52C9 # + +F952 ; 52D2 ; MA # ( 勒 → 勒 ) CJK COMPATIBILITY IDEOGRAPH-F952 → CJK UNIFIED IDEOGRAPH-52D2 # + +F92F ; 52DE ; MA # ( 勞 
→ 勞 ) CJK COMPATIBILITY IDEOGRAPH-F92F → CJK UNIFIED IDEOGRAPH-52DE # + +FA34 ; 52E4 ; MA # ( 勤 → 勤 ) CJK COMPATIBILITY IDEOGRAPH-FA34 → CJK UNIFIED IDEOGRAPH-52E4 # +2F827 ; 52E4 ; MA # ( 勤 → 勤 ) CJK COMPATIBILITY IDEOGRAPH-2F827 → CJK UNIFIED IDEOGRAPH-52E4 # + +F97F ; 52F5 ; MA # ( 勵 → 勵 ) CJK COMPATIBILITY IDEOGRAPH-F97F → CJK UNIFIED IDEOGRAPH-52F5 # + +2F13 ; 52F9 ; MA #* ( ⼓ → 勹 ) KANGXI RADICAL WRAP → CJK UNIFIED IDEOGRAPH-52F9 # + +FA77 ; 52FA ; MA # ( 勺 → 勺 ) CJK COMPATIBILITY IDEOGRAPH-FA77 → CJK UNIFIED IDEOGRAPH-52FA # +2F828 ; 52FA ; MA # ( 勺 → 勺 ) CJK COMPATIBILITY IDEOGRAPH-2F828 → CJK UNIFIED IDEOGRAPH-52FA # + +2F829 ; 5305 ; MA # ( 包 → 包 ) CJK COMPATIBILITY IDEOGRAPH-2F829 → CJK UNIFIED IDEOGRAPH-5305 # + +2F82A ; 5306 ; MA # ( 匆 → 匆 ) CJK COMPATIBILITY IDEOGRAPH-2F82A → CJK UNIFIED IDEOGRAPH-5306 # + +2F9DD ; 208DE ; MA # ( 𠣞 → 𠣞 ) CJK COMPATIBILITY IDEOGRAPH-2F9DD → CJK UNIFIED IDEOGRAPH-208DE # + +2F14 ; 5315 ; MA #* ( ⼔ → 匕 ) KANGXI RADICAL SPOON → CJK UNIFIED IDEOGRAPH-5315 # + +F963 ; 5317 ; MA # ( 北 → 北 ) CJK COMPATIBILITY IDEOGRAPH-F963 → CJK UNIFIED IDEOGRAPH-5317 # +2F82B ; 5317 ; MA # ( 北 → 北 ) CJK COMPATIBILITY IDEOGRAPH-2F82B → CJK UNIFIED IDEOGRAPH-5317 # + +2F15 ; 531A ; MA #* ( ⼕ → 匚 ) KANGXI RADICAL RIGHT OPEN BOX → CJK UNIFIED IDEOGRAPH-531A # + +2F16 ; 5338 ; MA #* ( ⼖ → 匸 ) KANGXI RADICAL HIDING ENCLOSURE → CJK UNIFIED IDEOGRAPH-5338 # + +F9EB ; 533F ; MA # ( 匿 → 匿 ) CJK COMPATIBILITY IDEOGRAPH-F9EB → CJK UNIFIED IDEOGRAPH-533F # + +2F17 ; 5341 ; MA #* ( ⼗ → 十 ) KANGXI RADICAL TEN → CJK UNIFIED IDEOGRAPH-5341 # +3038 ; 5341 ; MA # ( 〸 → 十 ) HANGZHOU NUMERAL TEN → CJK UNIFIED IDEOGRAPH-5341 # + +303A ; 5345 ; MA # ( 〺 → 卅 ) HANGZHOU NUMERAL THIRTY → CJK UNIFIED IDEOGRAPH-5345 # + +2F82C ; 5349 ; MA # ( 卉 → 卉 ) CJK COMPATIBILITY IDEOGRAPH-2F82C → CJK UNIFIED IDEOGRAPH-5349 # + +0FD6 ; 534D ; MA #* ( ࿖ → 卍 ) LEFT-FACING SVASTI SIGN → CJK UNIFIED IDEOGRAPH-534D # + +0FD5 ; 5350 ; MA #* ( ࿕ → 卐 ) RIGHT-FACING SVASTI SIGN → CJK 
UNIFIED IDEOGRAPH-5350 # + +FA35 ; 5351 ; MA # ( 卑 → 卑 ) CJK COMPATIBILITY IDEOGRAPH-FA35 → CJK UNIFIED IDEOGRAPH-5351 # +2F82D ; 5351 ; MA # ( 卑 → 卑 ) CJK COMPATIBILITY IDEOGRAPH-2F82D → CJK UNIFIED IDEOGRAPH-5351 # +2D161 ; 5351 ; MA # ( 𭅡 → 卑 ) CJK UNIFIED IDEOGRAPH-2D161 → CJK UNIFIED IDEOGRAPH-5351 # →卑→ + +2F82E ; 535A ; MA # ( 博 → 博 ) CJK COMPATIBILITY IDEOGRAPH-2F82E → CJK UNIFIED IDEOGRAPH-535A # + +30C8 ; 535C ; MA # ( ト → 卜 ) KATAKANA LETTER TO → CJK UNIFIED IDEOGRAPH-535C # →⼘→ +2F18 ; 535C ; MA #* ( ⼘ → 卜 ) KANGXI RADICAL DIVINATION → CJK UNIFIED IDEOGRAPH-535C # + +2F19 ; 5369 ; MA #* ( ⼙ → 卩 ) KANGXI RADICAL SEAL → CJK UNIFIED IDEOGRAPH-5369 # + +2E8B ; 353E ; MA #* ( ⺋ → 㔾 ) CJK RADICAL SEAL → CJK UNIFIED IDEOGRAPH-353E # + +2F82F ; 5373 ; MA # ( 即 → 即 ) CJK COMPATIBILITY IDEOGRAPH-2F82F → CJK UNIFIED IDEOGRAPH-5373 # + +F91C ; 5375 ; MA # ( 卵 → 卵 ) CJK COMPATIBILITY IDEOGRAPH-F91C → CJK UNIFIED IDEOGRAPH-5375 # + +2F830 ; 537D ; MA # ( 卽 → 卽 ) CJK COMPATIBILITY IDEOGRAPH-2F830 → CJK UNIFIED IDEOGRAPH-537D # + +2F831 ; 537F ; MA # ( 卿 → 卿 ) CJK COMPATIBILITY IDEOGRAPH-2F831 → CJK UNIFIED IDEOGRAPH-537F # +2F832 ; 537F ; MA # ( 卿 → 卿 ) CJK COMPATIBILITY IDEOGRAPH-2F832 → CJK UNIFIED IDEOGRAPH-537F # +2F833 ; 537F ; MA # ( 卿 → 卿 ) CJK COMPATIBILITY IDEOGRAPH-2F833 → CJK UNIFIED IDEOGRAPH-537F # + +2F1A ; 5382 ; MA #* ( ⼚ → 厂 ) KANGXI RADICAL CLIFF → CJK UNIFIED IDEOGRAPH-5382 # + +2F834 ; 20A2C ; MA # ( 𠨬 → 𠨬 ) CJK COMPATIBILITY IDEOGRAPH-2F834 → CJK UNIFIED IDEOGRAPH-20A2C # + +2F1B ; 53B6 ; MA #* ( ⼛ → 厶 ) KANGXI RADICAL PRIVATE → CJK UNIFIED IDEOGRAPH-53B6 # + +F96B ; 53C3 ; MA # ( 參 → 參 ) CJK COMPATIBILITY IDEOGRAPH-F96B → CJK UNIFIED IDEOGRAPH-53C3 # + +2F1C ; 53C8 ; MA #* ( ⼜ → 又 ) KANGXI RADICAL AGAIN → CJK UNIFIED IDEOGRAPH-53C8 # + +2F836 ; 53CA ; MA # ( 及 → 及 ) CJK COMPATIBILITY IDEOGRAPH-2F836 → CJK UNIFIED IDEOGRAPH-53CA # + +2F837 ; 53DF ; MA # ( 叟 → 叟 ) CJK COMPATIBILITY IDEOGRAPH-2F837 → CJK UNIFIED IDEOGRAPH-53DF # + +2F838 ; 20B63 ; 
MA # ( 𠭣 → 𠭣 ) CJK COMPATIBILITY IDEOGRAPH-2F838 → CJK UNIFIED IDEOGRAPH-20B63 # + +30ED ; 53E3 ; MA # ( ロ → 口 ) KATAKANA LETTER RO → CJK UNIFIED IDEOGRAPH-53E3 # →⼞→→⼝→ +2F1D ; 53E3 ; MA #* ( ⼝ → 口 ) KANGXI RADICAL MOUTH → CJK UNIFIED IDEOGRAPH-53E3 # +56D7 ; 53E3 ; MA # ( 囗 → 口 ) CJK UNIFIED IDEOGRAPH-56D7 → CJK UNIFIED IDEOGRAPH-53E3 # →⼞→→⼝→ +2F1E ; 53E3 ; MA #* ( ⼞ → 口 ) KANGXI RADICAL ENCLOSURE → CJK UNIFIED IDEOGRAPH-53E3 # →⼝→ + +F906 ; 53E5 ; MA # ( 句 → 句 ) CJK COMPATIBILITY IDEOGRAPH-F906 → CJK UNIFIED IDEOGRAPH-53E5 # + +2F839 ; 53EB ; MA # ( 叫 → 叫 ) CJK COMPATIBILITY IDEOGRAPH-2F839 → CJK UNIFIED IDEOGRAPH-53EB # + +2F83A ; 53F1 ; MA # ( 叱 → 叱 ) CJK COMPATIBILITY IDEOGRAPH-2F83A → CJK UNIFIED IDEOGRAPH-53F1 # + +2F83B ; 5406 ; MA # ( 吆 → 吆 ) CJK COMPATIBILITY IDEOGRAPH-2F83B → CJK UNIFIED IDEOGRAPH-5406 # + +F9DE ; 540F ; MA # ( 吏 → 吏 ) CJK COMPATIBILITY IDEOGRAPH-F9DE → CJK UNIFIED IDEOGRAPH-540F # + +F9ED ; 541D ; MA # ( 吝 → 吝 ) CJK COMPATIBILITY IDEOGRAPH-F9ED → CJK UNIFIED IDEOGRAPH-541D # + +2F83D ; 5438 ; MA # ( 吸 → 吸 ) CJK COMPATIBILITY IDEOGRAPH-2F83D → CJK UNIFIED IDEOGRAPH-5438 # + +F980 ; 5442 ; MA # ( 呂 → 呂 ) CJK COMPATIBILITY IDEOGRAPH-F980 → CJK UNIFIED IDEOGRAPH-5442 # + +2F83E ; 5448 ; MA # ( 呈 → 呈 ) CJK COMPATIBILITY IDEOGRAPH-2F83E → CJK UNIFIED IDEOGRAPH-5448 # + +2F83F ; 5468 ; MA # ( 周 → 周 ) CJK COMPATIBILITY IDEOGRAPH-2F83F → CJK UNIFIED IDEOGRAPH-5468 # + +2F83C ; 549E ; MA # ( 咞 → 咞 ) CJK COMPATIBILITY IDEOGRAPH-2F83C → CJK UNIFIED IDEOGRAPH-549E # + +2F840 ; 54A2 ; MA # ( 咢 → 咢 ) CJK COMPATIBILITY IDEOGRAPH-2F840 → CJK UNIFIED IDEOGRAPH-54A2 # + +F99E ; 54BD ; MA # ( 咽 → 咽 ) CJK COMPATIBILITY IDEOGRAPH-F99E → CJK UNIFIED IDEOGRAPH-54BD # + +439B ; 3588 ; MA # ( 䎛 → 㖈 ) CJK UNIFIED IDEOGRAPH-439B → CJK UNIFIED IDEOGRAPH-3588 # + +2F841 ; 54F6 ; MA # ( 哶 → 哶 ) CJK COMPATIBILITY IDEOGRAPH-2F841 → CJK UNIFIED IDEOGRAPH-54F6 # + +2F842 ; 5510 ; MA # ( 唐 → 唐 ) CJK COMPATIBILITY IDEOGRAPH-2F842 → CJK UNIFIED IDEOGRAPH-5510 # + +2F843 ; 
5553 ; MA # ( 啓 → 啓 ) CJK COMPATIBILITY IDEOGRAPH-2F843 → CJK UNIFIED IDEOGRAPH-5553 # +555F ; 5553 ; MA # ( 啟 → 啓 ) CJK UNIFIED IDEOGRAPH-555F → CJK UNIFIED IDEOGRAPH-5553 # + +FA79 ; 5555 ; MA # ( 啕 → 啕 ) CJK COMPATIBILITY IDEOGRAPH-FA79 → CJK UNIFIED IDEOGRAPH-5555 # + +2F844 ; 5563 ; MA # ( 啣 → 啣 ) CJK COMPATIBILITY IDEOGRAPH-2F844 → CJK UNIFIED IDEOGRAPH-5563 # + +2F845 ; 5584 ; MA # ( 善 → 善 ) CJK COMPATIBILITY IDEOGRAPH-2F845 → CJK UNIFIED IDEOGRAPH-5584 # +2F846 ; 5584 ; MA # ( 善 → 善 ) CJK COMPATIBILITY IDEOGRAPH-2F846 → CJK UNIFIED IDEOGRAPH-5584 # + +F90B ; 5587 ; MA # ( 喇 → 喇 ) CJK COMPATIBILITY IDEOGRAPH-F90B → CJK UNIFIED IDEOGRAPH-5587 # + +FA7A ; 5599 ; MA # ( 喙 → 喙 ) CJK COMPATIBILITY IDEOGRAPH-FA7A → CJK UNIFIED IDEOGRAPH-5599 # +2F847 ; 5599 ; MA # ( 喙 → 喙 ) CJK COMPATIBILITY IDEOGRAPH-2F847 → CJK UNIFIED IDEOGRAPH-5599 # + +FA36 ; 559D ; MA # ( 喝 → 喝 ) CJK COMPATIBILITY IDEOGRAPH-FA36 → CJK UNIFIED IDEOGRAPH-559D # +FA78 ; 559D ; MA # ( 喝 → 喝 ) CJK COMPATIBILITY IDEOGRAPH-FA78 → CJK UNIFIED IDEOGRAPH-559D # + +2F848 ; 55AB ; MA # ( 喫 → 喫 ) CJK COMPATIBILITY IDEOGRAPH-2F848 → CJK UNIFIED IDEOGRAPH-55AB # + +2F849 ; 55B3 ; MA # ( 喳 → 喳 ) CJK COMPATIBILITY IDEOGRAPH-2F849 → CJK UNIFIED IDEOGRAPH-55B3 # + +FA0D ; 55C0 ; MA # ( 嗀 → 嗀 ) CJK COMPATIBILITY IDEOGRAPH-FA0D → CJK UNIFIED IDEOGRAPH-55C0 # + +2F84A ; 55C2 ; MA # ( 嗂 → 嗂 ) CJK COMPATIBILITY IDEOGRAPH-2F84A → CJK UNIFIED IDEOGRAPH-55C2 # + +FA7B ; 55E2 ; MA # ( 嗢 → 嗢 ) CJK COMPATIBILITY IDEOGRAPH-FA7B → CJK UNIFIED IDEOGRAPH-55E2 # + +FA37 ; 5606 ; MA # ( 嘆 → 嘆 ) CJK COMPATIBILITY IDEOGRAPH-FA37 → CJK UNIFIED IDEOGRAPH-5606 # +2F84C ; 5606 ; MA # ( 嘆 → 嘆 ) CJK COMPATIBILITY IDEOGRAPH-2F84C → CJK UNIFIED IDEOGRAPH-5606 # + +2F84E ; 5651 ; MA # ( 噑 → 噑 ) CJK COMPATIBILITY IDEOGRAPH-2F84E → CJK UNIFIED IDEOGRAPH-5651 # + +2F84F ; 5674 ; MA # ( 噴 → 噴 ) CJK COMPATIBILITY IDEOGRAPH-2F84F → CJK UNIFIED IDEOGRAPH-5674 # + +FA38 ; 5668 ; MA # ( 器 → 器 ) CJK COMPATIBILITY IDEOGRAPH-FA38 → CJK UNIFIED 
IDEOGRAPH-5668 # + +F9A9 ; 56F9 ; MA # ( 囹 → 囹 ) CJK COMPATIBILITY IDEOGRAPH-F9A9 → CJK UNIFIED IDEOGRAPH-56F9 # + +2F84B ; 5716 ; MA # ( 圖 → 圖 ) CJK COMPATIBILITY IDEOGRAPH-2F84B → CJK UNIFIED IDEOGRAPH-5716 # + +2F84D ; 5717 ; MA # ( 圗 → 圗 ) CJK COMPATIBILITY IDEOGRAPH-2F84D → CJK UNIFIED IDEOGRAPH-5717 # + +2F1F ; 571F ; MA #* ( ⼟ → 土 ) KANGXI RADICAL EARTH → CJK UNIFIED IDEOGRAPH-571F # +58EB ; 571F ; MA # ( 士 → 土 ) CJK UNIFIED IDEOGRAPH-58EB → CJK UNIFIED IDEOGRAPH-571F # →⼠→→⼟→ +2F20 ; 571F ; MA #* ( ⼠ → 土 ) KANGXI RADICAL SCHOLAR → CJK UNIFIED IDEOGRAPH-571F # →⼟→ + +2F855 ; 578B ; MA # ( 型 → 型 ) CJK COMPATIBILITY IDEOGRAPH-2F855 → CJK UNIFIED IDEOGRAPH-578B # + +2F852 ; 57CE ; MA # ( 城 → 城 ) CJK COMPATIBILITY IDEOGRAPH-2F852 → CJK UNIFIED IDEOGRAPH-57CE # + +39B3 ; 363D ; MA # ( 㦳 → 㘽 ) CJK UNIFIED IDEOGRAPH-39B3 → CJK UNIFIED IDEOGRAPH-363D # + +2F853 ; 57F4 ; MA # ( 埴 → 埴 ) CJK COMPATIBILITY IDEOGRAPH-2F853 → CJK UNIFIED IDEOGRAPH-57F4 # + +2F854 ; 580D ; MA # ( 堍 → 堍 ) CJK COMPATIBILITY IDEOGRAPH-2F854 → CJK UNIFIED IDEOGRAPH-580D # + +2F857 ; 5831 ; MA # ( 報 → 報 ) CJK COMPATIBILITY IDEOGRAPH-2F857 → CJK UNIFIED IDEOGRAPH-5831 # + +2F856 ; 5832 ; MA # ( 堲 → 堲 ) CJK COMPATIBILITY IDEOGRAPH-2F856 → CJK UNIFIED IDEOGRAPH-5832 # + +FA39 ; 5840 ; MA # ( 塀 → 塀 ) CJK COMPATIBILITY IDEOGRAPH-FA39 → CJK UNIFIED IDEOGRAPH-5840 # + +FA10 ; 585A ; MA # ( 塚 → 塚 ) CJK COMPATIBILITY IDEOGRAPH-FA10 → CJK UNIFIED IDEOGRAPH-585A # +FA7C ; 585A ; MA # ( 塚 → 塚 ) CJK COMPATIBILITY IDEOGRAPH-FA7C → CJK UNIFIED IDEOGRAPH-585A # + +F96C ; 585E ; MA # ( 塞 → 塞 ) CJK COMPATIBILITY IDEOGRAPH-F96C → CJK UNIFIED IDEOGRAPH-585E # + +586B ; 5861 ; MA # ( 填 → 塡 ) CJK UNIFIED IDEOGRAPH-586B → CJK UNIFIED IDEOGRAPH-5861 # + +58FF ; 58AB ; MA # ( 壿 → 墫 ) CJK UNIFIED IDEOGRAPH-58FF → CJK UNIFIED IDEOGRAPH-58AB # + +2F858 ; 58AC ; MA # ( 墬 → 墬 ) CJK COMPATIBILITY IDEOGRAPH-2F858 → CJK UNIFIED IDEOGRAPH-58AC # + +FA7D ; 58B3 ; MA # ( 墳 → 墳 ) CJK COMPATIBILITY IDEOGRAPH-FA7D → CJK UNIFIED 
IDEOGRAPH-58B3 # + +F94A ; 58D8 ; MA # ( 壘 → 壘 ) CJK COMPATIBILITY IDEOGRAPH-F94A → CJK UNIFIED IDEOGRAPH-58D8 # + +F942 ; 58DF ; MA # ( 壟 → 壟 ) CJK COMPATIBILITY IDEOGRAPH-F942 → CJK UNIFIED IDEOGRAPH-58DF # + +2F859 ; 214E4 ; MA # ( 𡓤 → 𡓤 ) CJK COMPATIBILITY IDEOGRAPH-2F859 → CJK UNIFIED IDEOGRAPH-214E4 # + +2F851 ; 58EE ; MA # ( 壮 → 壮 ) CJK COMPATIBILITY IDEOGRAPH-2F851 → CJK UNIFIED IDEOGRAPH-58EE # + +2F85A ; 58F2 ; MA # ( 売 → 売 ) CJK COMPATIBILITY IDEOGRAPH-2F85A → CJK UNIFIED IDEOGRAPH-58F2 # + +2F85B ; 58F7 ; MA # ( 壷 → 壷 ) CJK COMPATIBILITY IDEOGRAPH-2F85B → CJK UNIFIED IDEOGRAPH-58F7 # +21533 ; 58F7 ; MA # ( 𡔳 → 壷 ) CJK UNIFIED IDEOGRAPH-21533 → CJK UNIFIED IDEOGRAPH-58F7 # →壷→ + +2F21 ; 5902 ; MA #* ( ⼡ → 夂 ) KANGXI RADICAL GO → CJK UNIFIED IDEOGRAPH-5902 # + +2F85C ; 5906 ; MA # ( 夆 → 夆 ) CJK COMPATIBILITY IDEOGRAPH-2F85C → CJK UNIFIED IDEOGRAPH-5906 # + +2F22 ; 590A ; MA #* ( ⼢ → 夊 ) KANGXI RADICAL GO SLOWLY → CJK UNIFIED IDEOGRAPH-590A # + +30BF ; 5915 ; MA # ( タ → 夕 ) KATAKANA LETTER TA → CJK UNIFIED IDEOGRAPH-5915 # →⼣→ +2F23 ; 5915 ; MA #* ( ⼣ → 夕 ) KANGXI RADICAL EVENING → CJK UNIFIED IDEOGRAPH-5915 # + +2F85D ; 591A ; MA # ( 多 → 多 ) CJK COMPATIBILITY IDEOGRAPH-2F85D → CJK UNIFIED IDEOGRAPH-591A # +21587 ; 591A ; MA # ( 𡖇 → 多 ) CJK UNIFIED IDEOGRAPH-21587 → CJK UNIFIED IDEOGRAPH-591A # →多→ + +2F85E ; 5922 ; MA # ( 夢 → 夢 ) CJK COMPATIBILITY IDEOGRAPH-2F85E → CJK UNIFIED IDEOGRAPH-5922 # + +2F24 ; 5927 ; MA #* ( ⼤ → 大 ) KANGXI RADICAL BIG → CJK UNIFIED IDEOGRAPH-5927 # + +FA7E ; 5944 ; MA # ( 奄 → 奄 ) CJK COMPATIBILITY IDEOGRAPH-FA7E → CJK UNIFIED IDEOGRAPH-5944 # + +F90C ; 5948 ; MA # ( 奈 → 奈 ) CJK COMPATIBILITY IDEOGRAPH-F90C → CJK UNIFIED IDEOGRAPH-5948 # + +FA7F ; 5954 ; MA # ( 奔 → 奔 ) CJK COMPATIBILITY IDEOGRAPH-FA7F → CJK UNIFIED IDEOGRAPH-5954 # + +F909 ; 5951 ; MA # ( 契 → 契 ) CJK COMPATIBILITY IDEOGRAPH-F909 → CJK UNIFIED IDEOGRAPH-5951 # + +2F85F ; 5962 ; MA # ( 奢 → 奢 ) CJK COMPATIBILITY IDEOGRAPH-2F85F → CJK UNIFIED IDEOGRAPH-5962 # + +F981 
; 5973 ; MA # ( 女 → 女 ) CJK COMPATIBILITY IDEOGRAPH-F981 → CJK UNIFIED IDEOGRAPH-5973 # +2F25 ; 5973 ; MA #* ( ⼥ → 女 ) KANGXI RADICAL WOMAN → CJK UNIFIED IDEOGRAPH-5973 # + +216A7 ; 216A8 ; MA # ( 𡚧 → 𡚨 ) CJK UNIFIED IDEOGRAPH-216A7 → CJK UNIFIED IDEOGRAPH-216A8 # →𡚨→ +2F860 ; 216A8 ; MA # ( 𡚨 → 𡚨 ) CJK COMPATIBILITY IDEOGRAPH-2F860 → CJK UNIFIED IDEOGRAPH-216A8 # + +2F861 ; 216EA ; MA # ( 𡛪 → 𡛪 ) CJK COMPATIBILITY IDEOGRAPH-2F861 → CJK UNIFIED IDEOGRAPH-216EA # + +2F865 ; 59D8 ; MA # ( 姘 → 姘 ) CJK COMPATIBILITY IDEOGRAPH-2F865 → CJK UNIFIED IDEOGRAPH-59D8 # + +2F862 ; 59EC ; MA # ( 姬 → 姬 ) CJK COMPATIBILITY IDEOGRAPH-2F862 → CJK UNIFIED IDEOGRAPH-59EC # + +2F863 ; 5A1B ; MA # ( 娛 → 娛 ) CJK COMPATIBILITY IDEOGRAPH-2F863 → CJK UNIFIED IDEOGRAPH-5A1B # + +2F864 ; 5A27 ; MA # ( 娧 → 娧 ) CJK COMPATIBILITY IDEOGRAPH-2F864 → CJK UNIFIED IDEOGRAPH-5A27 # + +FA80 ; 5A62 ; MA # ( 婢 → 婢 ) CJK COMPATIBILITY IDEOGRAPH-FA80 → CJK UNIFIED IDEOGRAPH-5A62 # + +2F866 ; 5A66 ; MA # ( 婦 → 婦 ) CJK COMPATIBILITY IDEOGRAPH-2F866 → CJK UNIFIED IDEOGRAPH-5A66 # + +5B00 ; 5AAF ; MA # ( 嬀 → 媯 ) CJK UNIFIED IDEOGRAPH-5B00 → CJK UNIFIED IDEOGRAPH-5AAF # + +2F867 ; 36EE ; MA # ( 㛮 → 㛮 ) CJK COMPATIBILITY IDEOGRAPH-2F867 → CJK UNIFIED IDEOGRAPH-36EE # + +2F868 ; 36FC ; MA # ( 㛼 → 㛼 ) CJK COMPATIBILITY IDEOGRAPH-2F868 → CJK UNIFIED IDEOGRAPH-36FC # + +2F986 ; 5AB5 ; MA # ( 媵 → 媵 ) CJK COMPATIBILITY IDEOGRAPH-2F986 → CJK UNIFIED IDEOGRAPH-5AB5 # + +2F869 ; 5B08 ; MA # ( 嬈 → 嬈 ) CJK COMPATIBILITY IDEOGRAPH-2F869 → CJK UNIFIED IDEOGRAPH-5B08 # + +FA81 ; 5B28 ; MA # ( 嬨 → 嬨 ) CJK COMPATIBILITY IDEOGRAPH-FA81 → CJK UNIFIED IDEOGRAPH-5B28 # + +2F86A ; 5B3E ; MA # ( 嬾 → 嬾 ) CJK COMPATIBILITY IDEOGRAPH-2F86A → CJK UNIFIED IDEOGRAPH-5B3E # +2F86B ; 5B3E ; MA # ( 嬾 → 嬾 ) CJK COMPATIBILITY IDEOGRAPH-2F86B → CJK UNIFIED IDEOGRAPH-5B3E # + +2F26 ; 5B50 ; MA #* ( ⼦ → 子 ) KANGXI RADICAL CHILD → CJK UNIFIED IDEOGRAPH-5B50 # + +2F27 ; 5B80 ; MA #* ( ⼧ → 宀 ) KANGXI RADICAL ROOF → CJK UNIFIED IDEOGRAPH-5B80 # + 
+FA04 ; 5B85 ; MA # ( 宅 → 宅 ) CJK COMPATIBILITY IDEOGRAPH-FA04 → CJK UNIFIED IDEOGRAPH-5B85 # + +2F86C ; 219C8 ; MA # ( 𡧈 → 𡧈 ) CJK COMPATIBILITY IDEOGRAPH-2F86C → CJK UNIFIED IDEOGRAPH-219C8 # + +2F86D ; 5BC3 ; MA # ( 寃 → 寃 ) CJK COMPATIBILITY IDEOGRAPH-2F86D → CJK UNIFIED IDEOGRAPH-5BC3 # + +2F86E ; 5BD8 ; MA # ( 寘 → 寘 ) CJK COMPATIBILITY IDEOGRAPH-2F86E → CJK UNIFIED IDEOGRAPH-5BD8 # + +F95F ; 5BE7 ; MA # ( 寧 → 寧 ) CJK COMPATIBILITY IDEOGRAPH-F95F → CJK UNIFIED IDEOGRAPH-5BE7 # +F9AA ; 5BE7 ; MA # ( 寧 → 寧 ) CJK COMPATIBILITY IDEOGRAPH-F9AA → CJK UNIFIED IDEOGRAPH-5BE7 # +2F86F ; 5BE7 ; MA # ( 寧 → 寧 ) CJK COMPATIBILITY IDEOGRAPH-2F86F → CJK UNIFIED IDEOGRAPH-5BE7 # + +F9BC ; 5BEE ; MA # ( 寮 → 寮 ) CJK COMPATIBILITY IDEOGRAPH-F9BC → CJK UNIFIED IDEOGRAPH-5BEE # + +2F870 ; 5BF3 ; MA # ( 寳 → 寳 ) CJK COMPATIBILITY IDEOGRAPH-2F870 → CJK UNIFIED IDEOGRAPH-5BF3 # + +2F871 ; 21B18 ; MA # ( 𡬘 → 𡬘 ) CJK COMPATIBILITY IDEOGRAPH-2F871 → CJK UNIFIED IDEOGRAPH-21B18 # + +2F28 ; 5BF8 ; MA #* ( ⼨ → 寸 ) KANGXI RADICAL INCH → CJK UNIFIED IDEOGRAPH-5BF8 # + +2F872 ; 5BFF ; MA # ( 寿 → 寿 ) CJK COMPATIBILITY IDEOGRAPH-2F872 → CJK UNIFIED IDEOGRAPH-5BFF # + +2F873 ; 5C06 ; MA # ( 将 → 将 ) CJK COMPATIBILITY IDEOGRAPH-2F873 → CJK UNIFIED IDEOGRAPH-5C06 # + +2F29 ; 5C0F ; MA #* ( ⼩ → 小 ) KANGXI RADICAL SMALL → CJK UNIFIED IDEOGRAPH-5C0F # + +2F875 ; 5C22 ; MA # ( 尢 → 尢 ) CJK COMPATIBILITY IDEOGRAPH-2F875 → CJK UNIFIED IDEOGRAPH-5C22 # +2E90 ; 5C22 ; MA #* ( ⺐ → 尢 ) CJK RADICAL LAME THREE → CJK UNIFIED IDEOGRAPH-5C22 # +2F2A ; 5C22 ; MA #* ( ⼪ → 尢 ) KANGXI RADICAL LAME → CJK UNIFIED IDEOGRAPH-5C22 # + +2E8F ; 5C23 ; MA #* ( ⺏ → 尣 ) CJK RADICAL LAME TWO → CJK UNIFIED IDEOGRAPH-5C23 # + +2F876 ; 3781 ; MA # ( 㞁 → 㞁 ) CJK COMPATIBILITY IDEOGRAPH-2F876 → CJK UNIFIED IDEOGRAPH-3781 # + +2F2B ; 5C38 ; MA #* ( ⼫ → 尸 ) KANGXI RADICAL CORPSE → CJK UNIFIED IDEOGRAPH-5C38 # + +F9BD ; 5C3F ; MA # ( 尿 → 尿 ) CJK COMPATIBILITY IDEOGRAPH-F9BD → CJK UNIFIED IDEOGRAPH-5C3F # + +2F877 ; 5C60 ; MA # ( 屠 → 屠 ) 
CJK COMPATIBILITY IDEOGRAPH-2F877 → CJK UNIFIED IDEOGRAPH-5C60 # + +F94B ; 5C62 ; MA # ( 屢 → 屢 ) CJK COMPATIBILITY IDEOGRAPH-F94B → CJK UNIFIED IDEOGRAPH-5C62 # + +FA3B ; 5C64 ; MA # ( 層 → 層 ) CJK COMPATIBILITY IDEOGRAPH-FA3B → CJK UNIFIED IDEOGRAPH-5C64 # + +F9DF ; 5C65 ; MA # ( 履 → 履 ) CJK COMPATIBILITY IDEOGRAPH-F9DF → CJK UNIFIED IDEOGRAPH-5C65 # + +FA3C ; 5C6E ; MA # ( 屮 → 屮 ) CJK COMPATIBILITY IDEOGRAPH-FA3C → CJK UNIFIED IDEOGRAPH-5C6E # +2F878 ; 5C6E ; MA # ( 屮 → 屮 ) CJK COMPATIBILITY IDEOGRAPH-2F878 → CJK UNIFIED IDEOGRAPH-5C6E # +2F2C ; 5C6E ; MA #* ( ⼬ → 屮 ) KANGXI RADICAL SPROUT → CJK UNIFIED IDEOGRAPH-5C6E # + +2F8F8 ; 21D0B ; MA # ( 𡴋 → 𡴋 ) CJK COMPATIBILITY IDEOGRAPH-2F8F8 → CJK UNIFIED IDEOGRAPH-21D0B # + +2F2D ; 5C71 ; MA #* ( ⼭ → 山 ) KANGXI RADICAL MOUNTAIN → CJK UNIFIED IDEOGRAPH-5C71 # + +2F879 ; 5CC0 ; MA # ( 峀 → 峀 ) CJK COMPATIBILITY IDEOGRAPH-2F879 → CJK UNIFIED IDEOGRAPH-5CC0 # +2B73A ; 5CC0 ; MA # ( 𫜺 → 峀 ) CJK UNIFIED IDEOGRAPH-2B73A → CJK UNIFIED IDEOGRAPH-5CC0 # + +2F87A ; 5C8D ; MA # ( 岍 → 岍 ) CJK COMPATIBILITY IDEOGRAPH-2F87A → CJK UNIFIED IDEOGRAPH-5C8D # + +2F87B ; 21DE4 ; MA # ( 𡷤 → 𡷤 ) CJK COMPATIBILITY IDEOGRAPH-2F87B → CJK UNIFIED IDEOGRAPH-21DE4 # + +2F87D ; 21DE6 ; MA # ( 𡷦 → 𡷦 ) CJK COMPATIBILITY IDEOGRAPH-2F87D → CJK UNIFIED IDEOGRAPH-21DE6 # + +F9D5 ; 5D19 ; MA # ( 崙 → 崙 ) CJK COMPATIBILITY IDEOGRAPH-F9D5 → CJK UNIFIED IDEOGRAPH-5D19 # + +2F87C ; 5D43 ; MA # ( 嵃 → 嵃 ) CJK COMPATIBILITY IDEOGRAPH-2F87C → CJK UNIFIED IDEOGRAPH-5D43 # + +F921 ; 5D50 ; MA # ( 嵐 → 嵐 ) CJK COMPATIBILITY IDEOGRAPH-F921 → CJK UNIFIED IDEOGRAPH-5D50 # + +2F87F ; 5D6B ; MA # ( 嵫 → 嵫 ) CJK COMPATIBILITY IDEOGRAPH-2F87F → CJK UNIFIED IDEOGRAPH-5D6B # + +2F87E ; 5D6E ; MA # ( 嵮 → 嵮 ) CJK COMPATIBILITY IDEOGRAPH-2F87E → CJK UNIFIED IDEOGRAPH-5D6E # + +2F880 ; 5D7C ; MA # ( 嵼 → 嵼 ) CJK COMPATIBILITY IDEOGRAPH-2F880 → CJK UNIFIED IDEOGRAPH-5D7C # + +2F9F4 ; 5DB2 ; MA # ( 嶲 → 嶲 ) CJK COMPATIBILITY IDEOGRAPH-2F9F4 → CJK UNIFIED IDEOGRAPH-5DB2 # + +F9AB ; 5DBA 
; MA # ( 嶺 → 嶺 ) CJK COMPATIBILITY IDEOGRAPH-F9AB → CJK UNIFIED IDEOGRAPH-5DBA # + +2F2E ; 5DDB ; MA #* ( ⼮ → 巛 ) KANGXI RADICAL RIVER → CJK UNIFIED IDEOGRAPH-5DDB # + +2F882 ; 5DE2 ; MA # ( 巢 → 巢 ) CJK COMPATIBILITY IDEOGRAPH-2F882 → CJK UNIFIED IDEOGRAPH-5DE2 # + +30A8 ; 5DE5 ; MA # ( エ → 工 ) KATAKANA LETTER E → CJK UNIFIED IDEOGRAPH-5DE5 # →⼯→ +2F2F ; 5DE5 ; MA #* ( ⼯ → 工 ) KANGXI RADICAL WORK → CJK UNIFIED IDEOGRAPH-5DE5 # + +2F30 ; 5DF1 ; MA #* ( ⼰ → 己 ) KANGXI RADICAL ONESELF → CJK UNIFIED IDEOGRAPH-5DF1 # + +2E92 ; 5DF3 ; MA #* ( ⺒ → 巳 ) CJK RADICAL SNAKE → CJK UNIFIED IDEOGRAPH-5DF3 # + +2F883 ; 382F ; MA # ( 㠯 → 㠯 ) CJK COMPATIBILITY IDEOGRAPH-2F883 → CJK UNIFIED IDEOGRAPH-382F # + +2F884 ; 5DFD ; MA # ( 巽 → 巽 ) CJK COMPATIBILITY IDEOGRAPH-2F884 → CJK UNIFIED IDEOGRAPH-5DFD # + +2F31 ; 5DFE ; MA #* ( ⼱ → 巾 ) KANGXI RADICAL TURBAN → CJK UNIFIED IDEOGRAPH-5DFE # + +5E32 ; 5E21 ; MA # ( 帲 → 帡 ) CJK UNIFIED IDEOGRAPH-5E32 → CJK UNIFIED IDEOGRAPH-5E21 # + +2F885 ; 5E28 ; MA # ( 帨 → 帨 ) CJK COMPATIBILITY IDEOGRAPH-2F885 → CJK UNIFIED IDEOGRAPH-5E28 # + +2F886 ; 5E3D ; MA # ( 帽 → 帽 ) CJK COMPATIBILITY IDEOGRAPH-2F886 → CJK UNIFIED IDEOGRAPH-5E3D # + +2F887 ; 5E69 ; MA # ( 幩 → 幩 ) CJK COMPATIBILITY IDEOGRAPH-2F887 → CJK UNIFIED IDEOGRAPH-5E69 # + +2F888 ; 3862 ; MA # ( 㡢 → 㡢 ) CJK COMPATIBILITY IDEOGRAPH-2F888 → CJK UNIFIED IDEOGRAPH-3862 # + +2F889 ; 22183 ; MA # ( 𢆃 → 𢆃 ) CJK COMPATIBILITY IDEOGRAPH-2F889 → CJK UNIFIED IDEOGRAPH-22183 # + +2F32 ; 5E72 ; MA #* ( ⼲ → 干 ) KANGXI RADICAL DRY → CJK UNIFIED IDEOGRAPH-5E72 # + +F98E ; 5E74 ; MA # ( 年 → 年 ) CJK COMPATIBILITY IDEOGRAPH-F98E → CJK UNIFIED IDEOGRAPH-5E74 # + +2F939 ; 2219F ; MA # ( 𢆟 → 𢆟 ) CJK COMPATIBILITY IDEOGRAPH-2F939 → CJK UNIFIED IDEOGRAPH-2219F # + +2E93 ; 5E7A ; MA #* ( ⺓ → 幺 ) CJK RADICAL THREAD → CJK UNIFIED IDEOGRAPH-5E7A # +2F33 ; 5E7A ; MA #* ( ⼳ → 幺 ) KANGXI RADICAL SHORT THREAD → CJK UNIFIED IDEOGRAPH-5E7A # + +2F34 ; 5E7F ; MA #* ( ⼴ → 广 ) KANGXI RADICAL DOTTED CLIFF → CJK UNIFIED 
IDEOGRAPH-5E7F # + +FA01 ; 5EA6 ; MA # ( 度 → 度 ) CJK COMPATIBILITY IDEOGRAPH-FA01 → CJK UNIFIED IDEOGRAPH-5EA6 # + +2F88A ; 387C ; MA # ( 㡼 → 㡼 ) CJK COMPATIBILITY IDEOGRAPH-2F88A → CJK UNIFIED IDEOGRAPH-387C # + +2F88B ; 5EB0 ; MA # ( 庰 → 庰 ) CJK COMPATIBILITY IDEOGRAPH-2F88B → CJK UNIFIED IDEOGRAPH-5EB0 # + +2F88C ; 5EB3 ; MA # ( 庳 → 庳 ) CJK COMPATIBILITY IDEOGRAPH-2F88C → CJK UNIFIED IDEOGRAPH-5EB3 # + +2F88D ; 5EB6 ; MA # ( 庶 → 庶 ) CJK COMPATIBILITY IDEOGRAPH-2F88D → CJK UNIFIED IDEOGRAPH-5EB6 # + +F928 ; 5ECA ; MA # ( 廊 → 廊 ) CJK COMPATIBILITY IDEOGRAPH-F928 → CJK UNIFIED IDEOGRAPH-5ECA # +2F88E ; 5ECA ; MA # ( 廊 → 廊 ) CJK COMPATIBILITY IDEOGRAPH-2F88E → CJK UNIFIED IDEOGRAPH-5ECA # + +F9A2 ; 5EC9 ; MA # ( 廉 → 廉 ) CJK COMPATIBILITY IDEOGRAPH-F9A2 → CJK UNIFIED IDEOGRAPH-5EC9 # + +FA82 ; 5ED2 ; MA # ( 廒 → 廒 ) CJK COMPATIBILITY IDEOGRAPH-FA82 → CJK UNIFIED IDEOGRAPH-5ED2 # + +FA0B ; 5ED3 ; MA # ( 廓 → 廓 ) CJK COMPATIBILITY IDEOGRAPH-FA0B → CJK UNIFIED IDEOGRAPH-5ED3 # + +FA83 ; 5ED9 ; MA # ( 廙 → 廙 ) CJK COMPATIBILITY IDEOGRAPH-FA83 → CJK UNIFIED IDEOGRAPH-5ED9 # + +F982 ; 5EEC ; MA # ( 廬 → 廬 ) CJK COMPATIBILITY IDEOGRAPH-F982 → CJK UNIFIED IDEOGRAPH-5EEC # + +2F35 ; 5EF4 ; MA #* ( ⼵ → 廴 ) KANGXI RADICAL LONG STRIDE → CJK UNIFIED IDEOGRAPH-5EF4 # + +2F890 ; 5EFE ; MA # ( 廾 → 廾 ) CJK COMPATIBILITY IDEOGRAPH-2F890 → CJK UNIFIED IDEOGRAPH-5EFE # +2F36 ; 5EFE ; MA #* ( ⼶ → 廾 ) KANGXI RADICAL TWO HANDS → CJK UNIFIED IDEOGRAPH-5EFE # + +2F891 ; 22331 ; MA # ( 𢌱 → 𢌱 ) CJK COMPATIBILITY IDEOGRAPH-2F891 → CJK UNIFIED IDEOGRAPH-22331 # +2F892 ; 22331 ; MA # ( 𢌱 → 𢌱 ) CJK COMPATIBILITY IDEOGRAPH-2F892 → CJK UNIFIED IDEOGRAPH-22331 # + +F943 ; 5F04 ; MA # ( 弄 → 弄 ) CJK COMPATIBILITY IDEOGRAPH-F943 → CJK UNIFIED IDEOGRAPH-5F04 # + +2F37 ; 5F0B ; MA #* ( ⼷ → 弋 ) KANGXI RADICAL SHOOT → CJK UNIFIED IDEOGRAPH-5F0B # + +2F38 ; 5F13 ; MA #* ( ⼸ → 弓 ) KANGXI RADICAL BOW → CJK UNIFIED IDEOGRAPH-5F13 # + +2F894 ; 5F22 ; MA # ( 弢 → 弢 ) CJK COMPATIBILITY IDEOGRAPH-2F894 → CJK UNIFIED 
IDEOGRAPH-5F22 # +2F895 ; 5F22 ; MA # ( 弢 → 弢 ) CJK COMPATIBILITY IDEOGRAPH-2F895 → CJK UNIFIED IDEOGRAPH-5F22 # + +2F39 ; 5F50 ; MA #* ( ⼹ → 彐 ) KANGXI RADICAL SNOUT → CJK UNIFIED IDEOGRAPH-5F50 # + +2E94 ; 5F51 ; MA #* ( ⺔ → 彑 ) CJK RADICAL SNOUT ONE → CJK UNIFIED IDEOGRAPH-5F51 # + +2F874 ; 5F53 ; MA # ( 当 → 当 ) CJK COMPATIBILITY IDEOGRAPH-2F874 → CJK UNIFIED IDEOGRAPH-5F53 # + +2F896 ; 38C7 ; MA # ( 㣇 → 㣇 ) CJK COMPATIBILITY IDEOGRAPH-2F896 → CJK UNIFIED IDEOGRAPH-38C7 # + +2F3A ; 5F61 ; MA #* ( ⼺ → 彡 ) KANGXI RADICAL BRISTLE → CJK UNIFIED IDEOGRAPH-5F61 # + +2F899 ; 5F62 ; MA # ( 形 → 形 ) CJK COMPATIBILITY IDEOGRAPH-2F899 → CJK UNIFIED IDEOGRAPH-5F62 # + +FA84 ; 5F69 ; MA # ( 彩 → 彩 ) CJK COMPATIBILITY IDEOGRAPH-FA84 → CJK UNIFIED IDEOGRAPH-5F69 # + +2F89A ; 5F6B ; MA # ( 彫 → 彫 ) CJK COMPATIBILITY IDEOGRAPH-2F89A → CJK UNIFIED IDEOGRAPH-5F6B # + +2F3B ; 5F73 ; MA #* ( ⼻ → 彳 ) KANGXI RADICAL STEP → CJK UNIFIED IDEOGRAPH-5F73 # + +F9D8 ; 5F8B ; MA # ( 律 → 律 ) CJK COMPATIBILITY IDEOGRAPH-F9D8 → CJK UNIFIED IDEOGRAPH-5F8B # + +2F89B ; 38E3 ; MA # ( 㣣 → 㣣 ) CJK COMPATIBILITY IDEOGRAPH-2F89B → CJK UNIFIED IDEOGRAPH-38E3 # + +22505 ; 5F9A ; MA # ( 𢔅 → 徚 ) CJK UNIFIED IDEOGRAPH-22505 → CJK UNIFIED IDEOGRAPH-5F9A # →徚→ +2F89C ; 5F9A ; MA # ( 徚 → 徚 ) CJK COMPATIBILITY IDEOGRAPH-2F89C → CJK UNIFIED IDEOGRAPH-5F9A # + +F966 ; 5FA9 ; MA # ( 復 → 復 ) CJK COMPATIBILITY IDEOGRAPH-F966 → CJK UNIFIED IDEOGRAPH-5FA9 # + +FA85 ; 5FAD ; MA # ( 徭 → 徭 ) CJK COMPATIBILITY IDEOGRAPH-FA85 → CJK UNIFIED IDEOGRAPH-5FAD # + +2F3C ; 5FC3 ; MA #* ( ⼼ → 心 ) KANGXI RADICAL HEART → CJK UNIFIED IDEOGRAPH-5FC3 # + +2E96 ; 5FC4 ; MA #* ( ⺖ → 忄 ) CJK RADICAL HEART ONE → CJK UNIFIED IDEOGRAPH-5FC4 # + +2E97 ; 38FA ; MA #* ( ⺗ → 㣺 ) CJK RADICAL HEART TWO → CJK UNIFIED IDEOGRAPH-38FA # + +2F89D ; 5FCD ; MA # ( 忍 → 忍 ) CJK COMPATIBILITY IDEOGRAPH-2F89D → CJK UNIFIED IDEOGRAPH-5FCD # + +2F89E ; 5FD7 ; MA # ( 志 → 志 ) CJK COMPATIBILITY IDEOGRAPH-2F89E → CJK UNIFIED IDEOGRAPH-5FD7 # + +F9A3 ; 5FF5 ; MA # ( 念 
→ 念 ) CJK COMPATIBILITY IDEOGRAPH-F9A3 → CJK UNIFIED IDEOGRAPH-5FF5 # + +2F89F ; 5FF9 ; MA # ( 忹 → 忹 ) CJK COMPATIBILITY IDEOGRAPH-2F89F → CJK UNIFIED IDEOGRAPH-5FF9 # + +F960 ; 6012 ; MA # ( 怒 → 怒 ) CJK COMPATIBILITY IDEOGRAPH-F960 → CJK UNIFIED IDEOGRAPH-6012 # + +F9AC ; 601C ; MA # ( 怜 → 怜 ) CJK COMPATIBILITY IDEOGRAPH-F9AC → CJK UNIFIED IDEOGRAPH-601C # + +FA6B ; 6075 ; MA # ( 恵 → 恵 ) CJK COMPATIBILITY IDEOGRAPH-FA6B → CJK UNIFIED IDEOGRAPH-6075 # + +2F8A2 ; 391C ; MA # ( 㤜 → 㤜 ) CJK COMPATIBILITY IDEOGRAPH-2F8A2 → CJK UNIFIED IDEOGRAPH-391C # + +2F8A1 ; 393A ; MA # ( 㤺 → 㤺 ) CJK COMPATIBILITY IDEOGRAPH-2F8A1 → CJK UNIFIED IDEOGRAPH-393A # + +2F8A0 ; 6081 ; MA # ( 悁 → 悁 ) CJK COMPATIBILITY IDEOGRAPH-2F8A0 → CJK UNIFIED IDEOGRAPH-6081 # + +FA3D ; 6094 ; MA # ( 悔 → 悔 ) CJK COMPATIBILITY IDEOGRAPH-FA3D → CJK UNIFIED IDEOGRAPH-6094 # +2F8A3 ; 6094 ; MA # ( 悔 → 悔 ) CJK COMPATIBILITY IDEOGRAPH-2F8A3 → CJK UNIFIED IDEOGRAPH-6094 # + +2F8A5 ; 60C7 ; MA # ( 惇 → 惇 ) CJK COMPATIBILITY IDEOGRAPH-2F8A5 → CJK UNIFIED IDEOGRAPH-60C7 # + +FA86 ; 60D8 ; MA # ( 惘 → 惘 ) CJK COMPATIBILITY IDEOGRAPH-FA86 → CJK UNIFIED IDEOGRAPH-60D8 # + +F9B9 ; 60E1 ; MA # ( 惡 → 惡 ) CJK COMPATIBILITY IDEOGRAPH-F9B9 → CJK UNIFIED IDEOGRAPH-60E1 # + +2F8A4 ; 226D4 ; MA # ( 𢛔 → 𢛔 ) CJK COMPATIBILITY IDEOGRAPH-2F8A4 → CJK UNIFIED IDEOGRAPH-226D4 # + +FA88 ; 6108 ; MA # ( 愈 → 愈 ) CJK COMPATIBILITY IDEOGRAPH-FA88 → CJK UNIFIED IDEOGRAPH-6108 # + +FA3E ; 6168 ; MA # ( 慨 → 慨 ) CJK COMPATIBILITY IDEOGRAPH-FA3E → CJK UNIFIED IDEOGRAPH-6168 # + +F9D9 ; 6144 ; MA # ( 慄 → 慄 ) CJK COMPATIBILITY IDEOGRAPH-F9D9 → CJK UNIFIED IDEOGRAPH-6144 # + +2F8A6 ; 6148 ; MA # ( 慈 → 慈 ) CJK COMPATIBILITY IDEOGRAPH-2F8A6 → CJK UNIFIED IDEOGRAPH-6148 # + +2F8A7 ; 614C ; MA # ( 慌 → 慌 ) CJK COMPATIBILITY IDEOGRAPH-2F8A7 → CJK UNIFIED IDEOGRAPH-614C # +2F8A9 ; 614C ; MA # ( 慌 → 慌 ) CJK COMPATIBILITY IDEOGRAPH-2F8A9 → CJK UNIFIED IDEOGRAPH-614C # + +FA87 ; 614E ; MA # ( 慎 → 慎 ) CJK COMPATIBILITY IDEOGRAPH-FA87 → CJK UNIFIED 
IDEOGRAPH-614E # +2F8A8 ; 614E ; MA # ( 慎 → 慎 ) CJK COMPATIBILITY IDEOGRAPH-2F8A8 → CJK UNIFIED IDEOGRAPH-614E # + +FA8A ; 6160 ; MA # ( 慠 → 慠 ) CJK COMPATIBILITY IDEOGRAPH-FA8A → CJK UNIFIED IDEOGRAPH-6160 # + +2F8AA ; 617A ; MA # ( 慺 → 慺 ) CJK COMPATIBILITY IDEOGRAPH-2F8AA → CJK UNIFIED IDEOGRAPH-617A # + +FA3F ; 618E ; MA # ( 憎 → 憎 ) CJK COMPATIBILITY IDEOGRAPH-FA3F → CJK UNIFIED IDEOGRAPH-618E # +FA89 ; 618E ; MA # ( 憎 → 憎 ) CJK COMPATIBILITY IDEOGRAPH-FA89 → CJK UNIFIED IDEOGRAPH-618E # +2F8AB ; 618E ; MA # ( 憎 → 憎 ) CJK COMPATIBILITY IDEOGRAPH-2F8AB → CJK UNIFIED IDEOGRAPH-618E # + +F98F ; 6190 ; MA # ( 憐 → 憐 ) CJK COMPATIBILITY IDEOGRAPH-F98F → CJK UNIFIED IDEOGRAPH-6190 # + +2F8AD ; 61A4 ; MA # ( 憤 → 憤 ) CJK COMPATIBILITY IDEOGRAPH-2F8AD → CJK UNIFIED IDEOGRAPH-61A4 # + +2F8AE ; 61AF ; MA # ( 憯 → 憯 ) CJK COMPATIBILITY IDEOGRAPH-2F8AE → CJK UNIFIED IDEOGRAPH-61AF # + +2F8AC ; 61B2 ; MA # ( 憲 → 憲 ) CJK COMPATIBILITY IDEOGRAPH-2F8AC → CJK UNIFIED IDEOGRAPH-61B2 # + +FAD0 ; 22844 ; MA # ( 𢡄 → 𢡄 ) CJK COMPATIBILITY IDEOGRAPH-FAD0 → CJK UNIFIED IDEOGRAPH-22844 # + +FACF ; 2284A ; MA # ( 𢡊 → 𢡊 ) CJK COMPATIBILITY IDEOGRAPH-FACF → CJK UNIFIED IDEOGRAPH-2284A # + +2F8AF ; 61DE ; MA # ( 懞 → 懞 ) CJK COMPATIBILITY IDEOGRAPH-2F8AF → CJK UNIFIED IDEOGRAPH-61DE # + +FA40 ; 61F2 ; MA # ( 懲 → 懲 ) CJK COMPATIBILITY IDEOGRAPH-FA40 → CJK UNIFIED IDEOGRAPH-61F2 # +FA8B ; 61F2 ; MA # ( 懲 → 懲 ) CJK COMPATIBILITY IDEOGRAPH-FA8B → CJK UNIFIED IDEOGRAPH-61F2 # +2F8B0 ; 61F2 ; MA # ( 懲 → 懲 ) CJK COMPATIBILITY IDEOGRAPH-2F8B0 → CJK UNIFIED IDEOGRAPH-61F2 # + +F90D ; 61F6 ; MA # ( 懶 → 懶 ) CJK COMPATIBILITY IDEOGRAPH-F90D → CJK UNIFIED IDEOGRAPH-61F6 # +2F8B1 ; 61F6 ; MA # ( 懶 → 懶 ) CJK COMPATIBILITY IDEOGRAPH-2F8B1 → CJK UNIFIED IDEOGRAPH-61F6 # + +F990 ; 6200 ; MA # ( 戀 → 戀 ) CJK COMPATIBILITY IDEOGRAPH-F990 → CJK UNIFIED IDEOGRAPH-6200 # + +2F3D ; 6208 ; MA #* ( ⼽ → 戈 ) KANGXI RADICAL HALBERD → CJK UNIFIED IDEOGRAPH-6208 # + +2F8B2 ; 6210 ; MA # ( 成 → 成 ) CJK COMPATIBILITY 
IDEOGRAPH-2F8B2 → CJK UNIFIED IDEOGRAPH-6210 # + +2F8B3 ; 621B ; MA # ( 戛 → 戛 ) CJK COMPATIBILITY IDEOGRAPH-2F8B3 → CJK UNIFIED IDEOGRAPH-621B # + +F9D2 ; 622E ; MA # ( 戮 → 戮 ) CJK COMPATIBILITY IDEOGRAPH-F9D2 → CJK UNIFIED IDEOGRAPH-622E # + +FA8C ; 6234 ; MA # ( 戴 → 戴 ) CJK COMPATIBILITY IDEOGRAPH-FA8C → CJK UNIFIED IDEOGRAPH-6234 # + +2F3E ; 6236 ; MA #* ( ⼾ → 戶 ) KANGXI RADICAL DOOR → CJK UNIFIED IDEOGRAPH-6236 # +6238 ; 6236 ; MA # ( 戸 → 戶 ) CJK UNIFIED IDEOGRAPH-6238 → CJK UNIFIED IDEOGRAPH-6236 # →⼾→ + +2F3F ; 624B ; MA #* ( ⼿ → 手 ) KANGXI RADICAL HAND → CJK UNIFIED IDEOGRAPH-624B # + +2E98 ; 624C ; MA #* ( ⺘ → 扌 ) CJK RADICAL HAND → CJK UNIFIED IDEOGRAPH-624C # + +2F8B4 ; 625D ; MA # ( 扝 → 扝 ) CJK COMPATIBILITY IDEOGRAPH-2F8B4 → CJK UNIFIED IDEOGRAPH-625D # + +2F8B5 ; 62B1 ; MA # ( 抱 → 抱 ) CJK COMPATIBILITY IDEOGRAPH-2F8B5 → CJK UNIFIED IDEOGRAPH-62B1 # + +F925 ; 62C9 ; MA # ( 拉 → 拉 ) CJK COMPATIBILITY IDEOGRAPH-F925 → CJK UNIFIED IDEOGRAPH-62C9 # + +F95B ; 62CF ; MA # ( 拏 → 拏 ) CJK COMPATIBILITY IDEOGRAPH-F95B → CJK UNIFIED IDEOGRAPH-62CF # + +FA02 ; 62D3 ; MA # ( 拓 → 拓 ) CJK COMPATIBILITY IDEOGRAPH-FA02 → CJK UNIFIED IDEOGRAPH-62D3 # + +2F8B6 ; 62D4 ; MA # ( 拔 → 拔 ) CJK COMPATIBILITY IDEOGRAPH-2F8B6 → CJK UNIFIED IDEOGRAPH-62D4 # + +2F8BA ; 62FC ; MA # ( 拼 → 拼 ) CJK COMPATIBILITY IDEOGRAPH-2F8BA → CJK UNIFIED IDEOGRAPH-62FC # + +F973 ; 62FE ; MA # ( 拾 → 拾 ) CJK COMPATIBILITY IDEOGRAPH-F973 → CJK UNIFIED IDEOGRAPH-62FE # + +2F8B8 ; 22B0C ; MA # ( 𢬌 → 𢬌 ) CJK COMPATIBILITY IDEOGRAPH-2F8B8 → CJK UNIFIED IDEOGRAPH-22B0C # + +2F8B9 ; 633D ; MA # ( 挽 → 挽 ) CJK COMPATIBILITY IDEOGRAPH-2F8B9 → CJK UNIFIED IDEOGRAPH-633D # + +2F8B7 ; 6350 ; MA # ( 捐 → 捐 ) CJK COMPATIBILITY IDEOGRAPH-2F8B7 → CJK UNIFIED IDEOGRAPH-6350 # + +2F8BB ; 6368 ; MA # ( 捨 → 捨 ) CJK COMPATIBILITY IDEOGRAPH-2F8BB → CJK UNIFIED IDEOGRAPH-6368 # + +F9A4 ; 637B ; MA # ( 捻 → 捻 ) CJK COMPATIBILITY IDEOGRAPH-F9A4 → CJK UNIFIED IDEOGRAPH-637B # + +2F8BC ; 6383 ; MA # ( 掃 → 掃 ) CJK COMPATIBILITY 
IDEOGRAPH-2F8BC → CJK UNIFIED IDEOGRAPH-6383 # + +F975 ; 63A0 ; MA # ( 掠 → 掠 ) CJK COMPATIBILITY IDEOGRAPH-F975 → CJK UNIFIED IDEOGRAPH-63A0 # + +2F8C1 ; 63A9 ; MA # ( 掩 → 掩 ) CJK COMPATIBILITY IDEOGRAPH-2F8C1 → CJK UNIFIED IDEOGRAPH-63A9 # + +FA8D ; 63C4 ; MA # ( 揄 → 揄 ) CJK COMPATIBILITY IDEOGRAPH-FA8D → CJK UNIFIED IDEOGRAPH-63C4 # + +2F8BD ; 63E4 ; MA # ( 揤 → 揤 ) CJK COMPATIBILITY IDEOGRAPH-2F8BD → CJK UNIFIED IDEOGRAPH-63E4 # + +FA8E ; 641C ; MA # ( 搜 → 搜 ) CJK COMPATIBILITY IDEOGRAPH-FA8E → CJK UNIFIED IDEOGRAPH-641C # + +2F8BE ; 22BF1 ; MA # ( 𢯱 → 𢯱 ) CJK COMPATIBILITY IDEOGRAPH-2F8BE → CJK UNIFIED IDEOGRAPH-22BF1 # + +2F8BF ; 6422 ; MA # ( 搢 → 搢 ) CJK COMPATIBILITY IDEOGRAPH-2F8BF → CJK UNIFIED IDEOGRAPH-6422 # + +2F8C0 ; 63C5 ; MA # ( 揅 → 揅 ) CJK COMPATIBILITY IDEOGRAPH-2F8C0 → CJK UNIFIED IDEOGRAPH-63C5 # + +FA8F ; 6452 ; MA # ( 摒 → 摒 ) CJK COMPATIBILITY IDEOGRAPH-FA8F → CJK UNIFIED IDEOGRAPH-6452 # + +2F8C3 ; 6469 ; MA # ( 摩 → 摩 ) CJK COMPATIBILITY IDEOGRAPH-2F8C3 → CJK UNIFIED IDEOGRAPH-6469 # + +2F8C6 ; 6477 ; MA # ( 摷 → 摷 ) CJK COMPATIBILITY IDEOGRAPH-2F8C6 → CJK UNIFIED IDEOGRAPH-6477 # + +2F8C4 ; 647E ; MA # ( 摾 → 摾 ) CJK COMPATIBILITY IDEOGRAPH-2F8C4 → CJK UNIFIED IDEOGRAPH-647E # + +2F8C2 ; 3A2E ; MA # ( 㨮 → 㨮 ) CJK COMPATIBILITY IDEOGRAPH-2F8C2 → CJK UNIFIED IDEOGRAPH-3A2E # + +6409 ; 3A41 ; MA # ( 搉 → 㩁 ) CJK UNIFIED IDEOGRAPH-6409 → CJK UNIFIED IDEOGRAPH-3A41 # + +F991 ; 649A ; MA # ( 撚 → 撚 ) CJK COMPATIBILITY IDEOGRAPH-F991 → CJK UNIFIED IDEOGRAPH-649A # + +2F8C5 ; 649D ; MA # ( 撝 → 撝 ) CJK COMPATIBILITY IDEOGRAPH-2F8C5 → CJK UNIFIED IDEOGRAPH-649D # + +F930 ; 64C4 ; MA # ( 擄 → 擄 ) CJK COMPATIBILITY IDEOGRAPH-F930 → CJK UNIFIED IDEOGRAPH-64C4 # + +2F8C7 ; 3A6C ; MA # ( 㩬 → 㩬 ) CJK COMPATIBILITY IDEOGRAPH-2F8C7 → CJK UNIFIED IDEOGRAPH-3A6C # + +2F40 ; 652F ; MA #* ( ⽀ → 支 ) KANGXI RADICAL BRANCH → CJK UNIFIED IDEOGRAPH-652F # + +2F41 ; 6534 ; MA #* ( ⽁ → 攴 ) KANGXI RADICAL RAP → CJK UNIFIED IDEOGRAPH-6534 # + +2E99 ; 6535 ; MA #* ( ⺙ → 攵 ) CJK 
RADICAL RAP → CJK UNIFIED IDEOGRAPH-6535 # + +FA41 ; 654F ; MA # ( 敏 → 敏 ) CJK COMPATIBILITY IDEOGRAPH-FA41 → CJK UNIFIED IDEOGRAPH-654F # +2F8C8 ; 654F ; MA # ( 敏 → 敏 ) CJK COMPATIBILITY IDEOGRAPH-2F8C8 → CJK UNIFIED IDEOGRAPH-654F # + +FA90 ; 6556 ; MA # ( 敖 → 敖 ) CJK COMPATIBILITY IDEOGRAPH-FA90 → CJK UNIFIED IDEOGRAPH-6556 # + +2F8C9 ; 656C ; MA # ( 敬 → 敬 ) CJK COMPATIBILITY IDEOGRAPH-2F8C9 → CJK UNIFIED IDEOGRAPH-656C # + +F969 ; 6578 ; MA # ( 數 → 數 ) CJK COMPATIBILITY IDEOGRAPH-F969 → CJK UNIFIED IDEOGRAPH-6578 # + +2F8CA ; 2300A ; MA # ( 𣀊 → 𣀊 ) CJK COMPATIBILITY IDEOGRAPH-2F8CA → CJK UNIFIED IDEOGRAPH-2300A # + +2F42 ; 6587 ; MA #* ( ⽂ → 文 ) KANGXI RADICAL SCRIPT → CJK UNIFIED IDEOGRAPH-6587 # + +2EEB ; 6589 ; MA #* ( ⻫ → 斉 ) CJK RADICAL J-SIMPLIFIED EVEN → CJK UNIFIED IDEOGRAPH-6589 # + +2F43 ; 6597 ; MA #* ( ⽃ → 斗 ) KANGXI RADICAL DIPPER → CJK UNIFIED IDEOGRAPH-6597 # + +F9BE ; 6599 ; MA # ( 料 → 料 ) CJK COMPATIBILITY IDEOGRAPH-F9BE → CJK UNIFIED IDEOGRAPH-6599 # + +2F44 ; 65A4 ; MA #* ( ⽄ → 斤 ) KANGXI RADICAL AXE → CJK UNIFIED IDEOGRAPH-65A4 # + +2F45 ; 65B9 ; MA #* ( ⽅ → 方 ) KANGXI RADICAL SQUARE → CJK UNIFIED IDEOGRAPH-65B9 # + +F983 ; 65C5 ; MA # ( 旅 → 旅 ) CJK COMPATIBILITY IDEOGRAPH-F983 → CJK UNIFIED IDEOGRAPH-65C5 # + +2F46 ; 65E0 ; MA #* ( ⽆ → 无 ) KANGXI RADICAL NOT → CJK UNIFIED IDEOGRAPH-65E0 # + +2E9B ; 65E1 ; MA #* ( ⺛ → 旡 ) CJK RADICAL CHOKE → CJK UNIFIED IDEOGRAPH-65E1 # + +FA42 ; 65E2 ; MA # ( 既 → 既 ) CJK COMPATIBILITY IDEOGRAPH-FA42 → CJK UNIFIED IDEOGRAPH-65E2 # + +2F8CB ; 65E3 ; MA # ( 旣 → 旣 ) CJK COMPATIBILITY IDEOGRAPH-2F8CB → CJK UNIFIED IDEOGRAPH-65E3 # + +2F47 ; 65E5 ; MA #* ( ⽇ → 日 ) KANGXI RADICAL SUN → CJK UNIFIED IDEOGRAPH-65E5 # + +F9E0 ; 6613 ; MA # ( 易 → 易 ) CJK COMPATIBILITY IDEOGRAPH-F9E0 → CJK UNIFIED IDEOGRAPH-6613 # + +66F6 ; 3ADA ; MA # ( 曶 → 㫚 ) CJK UNIFIED IDEOGRAPH-66F6 → CJK UNIFIED IDEOGRAPH-3ADA # + +2F8D1 ; 3AE4 ; MA # ( 㫤 → 㫤 ) CJK COMPATIBILITY IDEOGRAPH-2F8D1 → CJK UNIFIED IDEOGRAPH-3AE4 # + +2F8CD ; 6649 ; MA 
# ( 晉 → 晉 ) CJK COMPATIBILITY IDEOGRAPH-2F8CD → CJK UNIFIED IDEOGRAPH-6649 # + +6669 ; 665A ; MA # ( 晩 → 晚 ) CJK UNIFIED IDEOGRAPH-6669 → CJK UNIFIED IDEOGRAPH-665A # + +FA12 ; 6674 ; MA # ( 晴 → 晴 ) CJK COMPATIBILITY IDEOGRAPH-FA12 → CJK UNIFIED IDEOGRAPH-6674 # +FA91 ; 6674 ; MA # ( 晴 → 晴 ) CJK COMPATIBILITY IDEOGRAPH-FA91 → CJK UNIFIED IDEOGRAPH-6674 # + +FA43 ; 6691 ; MA # ( 暑 → 暑 ) CJK COMPATIBILITY IDEOGRAPH-FA43 → CJK UNIFIED IDEOGRAPH-6691 # +2F8CF ; 6691 ; MA # ( 暑 → 暑 ) CJK COMPATIBILITY IDEOGRAPH-2F8CF → CJK UNIFIED IDEOGRAPH-6691 # + +F9C5 ; 6688 ; MA # ( 暈 → 暈 ) CJK COMPATIBILITY IDEOGRAPH-F9C5 → CJK UNIFIED IDEOGRAPH-6688 # + +2F8D0 ; 3B08 ; MA # ( 㬈 → 㬈 ) CJK COMPATIBILITY IDEOGRAPH-2F8D0 → CJK UNIFIED IDEOGRAPH-3B08 # + +2F8D5 ; 669C ; MA # ( 暜 → 暜 ) CJK COMPATIBILITY IDEOGRAPH-2F8D5 → CJK UNIFIED IDEOGRAPH-669C # + +FA06 ; 66B4 ; MA # ( 暴 → 暴 ) CJK COMPATIBILITY IDEOGRAPH-FA06 → CJK UNIFIED IDEOGRAPH-66B4 # + +F98B ; 66C6 ; MA # ( 曆 → 曆 ) CJK COMPATIBILITY IDEOGRAPH-F98B → CJK UNIFIED IDEOGRAPH-66C6 # + +2F8CE ; 3B19 ; MA # ( 㬙 → 㬙 ) CJK COMPATIBILITY IDEOGRAPH-2F8CE → CJK UNIFIED IDEOGRAPH-3B19 # + +2F897 ; 232B8 ; MA # ( 𣊸 → 𣊸 ) CJK COMPATIBILITY IDEOGRAPH-2F897 → CJK UNIFIED IDEOGRAPH-232B8 # + +2F48 ; 66F0 ; MA #* ( ⽈ → 曰 ) KANGXI RADICAL SAY → CJK UNIFIED IDEOGRAPH-66F0 # + +F901 ; 66F4 ; MA # ( 更 → 更 ) CJK COMPATIBILITY IDEOGRAPH-F901 → CJK UNIFIED IDEOGRAPH-66F4 # + +2F8CC ; 66F8 ; MA # ( 書 → 書 ) CJK COMPATIBILITY IDEOGRAPH-2F8CC → CJK UNIFIED IDEOGRAPH-66F8 # + +2F49 ; 6708 ; MA #* ( ⽉ → 月 ) KANGXI RADICAL MOON → CJK UNIFIED IDEOGRAPH-6708 # + +2F980 ; 2335F ; MA # ( 𣍟 → 𣍟 ) CJK COMPATIBILITY IDEOGRAPH-2F980 → CJK UNIFIED IDEOGRAPH-2335F # +2B73E ; 2335F ; MA # ( 𫜾 → 𣍟 ) CJK UNIFIED IDEOGRAPH-2B73E → CJK UNIFIED IDEOGRAPH-2335F # + +80A6 ; 670C ; MA # ( 肦 → 朌 ) CJK UNIFIED IDEOGRAPH-80A6 → CJK UNIFIED IDEOGRAPH-670C # + +80D0 ; 670F ; MA # ( 胐 → 朏 ) CJK UNIFIED IDEOGRAPH-80D0 → CJK UNIFIED IDEOGRAPH-670F # + +80CA ; 6710 ; MA # ( 胊 → 朐 ) CJK 
UNIFIED IDEOGRAPH-80CA → CJK UNIFIED IDEOGRAPH-6710 # + +8101 ; 6713 ; MA # ( 脁 → 朓 ) CJK UNIFIED IDEOGRAPH-8101 → CJK UNIFIED IDEOGRAPH-6713 # + +80F6 ; 3B35 ; MA # ( 胶 → 㬵 ) CJK UNIFIED IDEOGRAPH-80F6 → CJK UNIFIED IDEOGRAPH-3B35 # + +F929 ; 6717 ; MA # ( 朗 → 朗 ) CJK COMPATIBILITY IDEOGRAPH-F929 → CJK UNIFIED IDEOGRAPH-6717 # +FA92 ; 6717 ; MA # ( 朗 → 朗 ) CJK COMPATIBILITY IDEOGRAPH-FA92 → CJK UNIFIED IDEOGRAPH-6717 # +2F8D8 ; 6717 ; MA # ( 朗 → 朗 ) CJK COMPATIBILITY IDEOGRAPH-2F8D8 → CJK UNIFIED IDEOGRAPH-6717 # + +8127 ; 6718 ; MA # ( 脧 → 朘 ) CJK UNIFIED IDEOGRAPH-8127 → CJK UNIFIED IDEOGRAPH-6718 # + +FA93 ; 671B ; MA # ( 望 → 望 ) CJK COMPATIBILITY IDEOGRAPH-FA93 → CJK UNIFIED IDEOGRAPH-671B # +2F8D9 ; 671B ; MA # ( 望 → 望 ) CJK COMPATIBILITY IDEOGRAPH-2F8D9 → CJK UNIFIED IDEOGRAPH-671B # + +5E50 ; 3B3A ; MA # ( 幐 → 㬺 ) CJK UNIFIED IDEOGRAPH-5E50 → CJK UNIFIED IDEOGRAPH-3B3A # + +4420 ; 3B3B ; MA # ( 䐠 → 㬻 ) CJK UNIFIED IDEOGRAPH-4420 → CJK UNIFIED IDEOGRAPH-3B3B # + +2F989 ; 23393 ; MA # ( 𣎓 → 𣎓 ) CJK COMPATIBILITY IDEOGRAPH-2F989 → CJK UNIFIED IDEOGRAPH-23393 # + +81A7 ; 6723 ; MA # ( 膧 → 朣 ) CJK UNIFIED IDEOGRAPH-81A7 → CJK UNIFIED IDEOGRAPH-6723 # + +2F98A ; 2339C ; MA # ( 𣎜 → 𣎜 ) CJK COMPATIBILITY IDEOGRAPH-2F98A → CJK UNIFIED IDEOGRAPH-2339C # + +2F4A ; 6728 ; MA #* ( ⽊ → 木 ) KANGXI RADICAL TREE → CJK UNIFIED IDEOGRAPH-6728 # + +F9E1 ; 674E ; MA # ( 李 → 李 ) CJK COMPATIBILITY IDEOGRAPH-F9E1 → CJK UNIFIED IDEOGRAPH-674E # + +2F8DC ; 6753 ; MA # ( 杓 → 杓 ) CJK COMPATIBILITY IDEOGRAPH-2F8DC → CJK UNIFIED IDEOGRAPH-6753 # + +FA94 ; 6756 ; MA # ( 杖 → 杖 ) CJK COMPATIBILITY IDEOGRAPH-FA94 → CJK UNIFIED IDEOGRAPH-6756 # + +2F8DB ; 675E ; MA # ( 杞 → 杞 ) CJK COMPATIBILITY IDEOGRAPH-2F8DB → CJK UNIFIED IDEOGRAPH-675E # + +2F8DD ; 233C3 ; MA # ( 𣏃 → 𣏃 ) CJK COMPATIBILITY IDEOGRAPH-2F8DD → CJK UNIFIED IDEOGRAPH-233C3 # + +67FF ; 676E ; MA # ( 柿 → 杮 ) CJK UNIFIED IDEOGRAPH-67FF → CJK UNIFIED IDEOGRAPH-676E # + +F9C8 ; 677B ; MA # ( 杻 → 杻 ) CJK COMPATIBILITY IDEOGRAPH-F9C8 
→ CJK UNIFIED IDEOGRAPH-677B # + +2F8E0 ; 6785 ; MA # ( 枅 → 枅 ) CJK COMPATIBILITY IDEOGRAPH-2F8E0 → CJK UNIFIED IDEOGRAPH-6785 # + +F9F4 ; 6797 ; MA # ( 林 → 林 ) CJK COMPATIBILITY IDEOGRAPH-F9F4 → CJK UNIFIED IDEOGRAPH-6797 # + +2F8DE ; 3B49 ; MA # ( 㭉 → 㭉 ) CJK COMPATIBILITY IDEOGRAPH-2F8DE → CJK UNIFIED IDEOGRAPH-3B49 # + +FAD1 ; 233D5 ; MA # ( 𣏕 → 𣏕 ) CJK COMPATIBILITY IDEOGRAPH-FAD1 → CJK UNIFIED IDEOGRAPH-233D5 # + +F9C9 ; 67F3 ; MA # ( 柳 → 柳 ) CJK COMPATIBILITY IDEOGRAPH-F9C9 → CJK UNIFIED IDEOGRAPH-67F3 # + +2F8DF ; 67FA ; MA # ( 柺 → 柺 ) CJK COMPATIBILITY IDEOGRAPH-2F8DF → CJK UNIFIED IDEOGRAPH-67FA # + +F9DA ; 6817 ; MA # ( 栗 → 栗 ) CJK COMPATIBILITY IDEOGRAPH-F9DA → CJK UNIFIED IDEOGRAPH-6817 # + +2F8E5 ; 681F ; MA # ( 栟 → 栟 ) CJK COMPATIBILITY IDEOGRAPH-2F8E5 → CJK UNIFIED IDEOGRAPH-681F # + +2F8E1 ; 6852 ; MA # ( 桒 → 桒 ) CJK COMPATIBILITY IDEOGRAPH-2F8E1 → CJK UNIFIED IDEOGRAPH-6852 # + +2F8E3 ; 2346D ; MA # ( 𣑭 → 𣑭 ) CJK COMPATIBILITY IDEOGRAPH-2F8E3 → CJK UNIFIED IDEOGRAPH-2346D # + +F97A ; 6881 ; MA # ( 梁 → 梁 ) CJK COMPATIBILITY IDEOGRAPH-F97A → CJK UNIFIED IDEOGRAPH-6881 # + +FA44 ; 6885 ; MA # ( 梅 → 梅 ) CJK COMPATIBILITY IDEOGRAPH-FA44 → CJK UNIFIED IDEOGRAPH-6885 # +2F8E2 ; 6885 ; MA # ( 梅 → 梅 ) CJK COMPATIBILITY IDEOGRAPH-2F8E2 → CJK UNIFIED IDEOGRAPH-6885 # + +2F8E4 ; 688E ; MA # ( 梎 → 梎 ) CJK COMPATIBILITY IDEOGRAPH-2F8E4 → CJK UNIFIED IDEOGRAPH-688E # + +F9E2 ; 68A8 ; MA # ( 梨 → 梨 ) CJK COMPATIBILITY IDEOGRAPH-F9E2 → CJK UNIFIED IDEOGRAPH-68A8 # + +2F8E6 ; 6914 ; MA # ( 椔 → 椔 ) CJK COMPATIBILITY IDEOGRAPH-2F8E6 → CJK UNIFIED IDEOGRAPH-6914 # + +2F8E8 ; 6942 ; MA # ( 楂 → 楂 ) CJK COMPATIBILITY IDEOGRAPH-2F8E8 → CJK UNIFIED IDEOGRAPH-6942 # + +FAD2 ; 3B9D ; MA # ( 㮝 → 㮝 ) CJK COMPATIBILITY IDEOGRAPH-FAD2 → CJK UNIFIED IDEOGRAPH-3B9D # +2F8E7 ; 3B9D ; MA # ( 㮝 → 㮝 ) CJK COMPATIBILITY IDEOGRAPH-2F8E7 → CJK UNIFIED IDEOGRAPH-3B9D # + +69E9 ; 3BA3 ; MA # ( 槩 → 㮣 ) CJK UNIFIED IDEOGRAPH-69E9 → CJK UNIFIED IDEOGRAPH-3BA3 # + +6A27 ; 699D ; MA # ( 樧 → 榝 ) 
CJK UNIFIED IDEOGRAPH-6A27 → CJK UNIFIED IDEOGRAPH-699D # + +2F8E9 ; 69A3 ; MA # ( 榣 → 榣 ) CJK COMPATIBILITY IDEOGRAPH-2F8E9 → CJK UNIFIED IDEOGRAPH-69A3 # + +2F8EA ; 69EA ; MA # ( 槪 → 槪 ) CJK COMPATIBILITY IDEOGRAPH-2F8EA → CJK UNIFIED IDEOGRAPH-69EA # + +F914 ; 6A02 ; MA # ( 樂 → 樂 ) CJK COMPATIBILITY IDEOGRAPH-F914 → CJK UNIFIED IDEOGRAPH-6A02 # +F95C ; 6A02 ; MA # ( 樂 → 樂 ) CJK COMPATIBILITY IDEOGRAPH-F95C → CJK UNIFIED IDEOGRAPH-6A02 # +F9BF ; 6A02 ; MA # ( 樂 → 樂 ) CJK COMPATIBILITY IDEOGRAPH-F9BF → CJK UNIFIED IDEOGRAPH-6A02 # + +F94C ; 6A13 ; MA # ( 樓 → 樓 ) CJK COMPATIBILITY IDEOGRAPH-F94C → CJK UNIFIED IDEOGRAPH-6A13 # + +2F8EC ; 236A3 ; MA # ( 𣚣 → 𣚣 ) CJK COMPATIBILITY IDEOGRAPH-2F8EC → CJK UNIFIED IDEOGRAPH-236A3 # + +2F8EB ; 6AA8 ; MA # ( 檨 → 檨 ) CJK COMPATIBILITY IDEOGRAPH-2F8EB → CJK UNIFIED IDEOGRAPH-6AA8 # + +F931 ; 6AD3 ; MA # ( 櫓 → 櫓 ) CJK COMPATIBILITY IDEOGRAPH-F931 → CJK UNIFIED IDEOGRAPH-6AD3 # + +2F8ED ; 6ADB ; MA # ( 櫛 → 櫛 ) CJK COMPATIBILITY IDEOGRAPH-2F8ED → CJK UNIFIED IDEOGRAPH-6ADB # + +F91D ; 6B04 ; MA # ( 欄 → 欄 ) CJK COMPATIBILITY IDEOGRAPH-F91D → CJK UNIFIED IDEOGRAPH-6B04 # + +2F8EE ; 3C18 ; MA # ( 㰘 → 㰘 ) CJK COMPATIBILITY IDEOGRAPH-2F8EE → CJK UNIFIED IDEOGRAPH-3C18 # + +2F4B ; 6B20 ; MA #* ( ⽋ → 欠 ) KANGXI RADICAL LACK → CJK UNIFIED IDEOGRAPH-6B20 # + +2F8EF ; 6B21 ; MA # ( 次 → 次 ) CJK COMPATIBILITY IDEOGRAPH-2F8EF → CJK UNIFIED IDEOGRAPH-6B21 # + +2F8F0 ; 238A7 ; MA # ( 𣢧 → 𣢧 ) CJK COMPATIBILITY IDEOGRAPH-2F8F0 → CJK UNIFIED IDEOGRAPH-238A7 # + +2F8F1 ; 6B54 ; MA # ( 歔 → 歔 ) CJK COMPATIBILITY IDEOGRAPH-2F8F1 → CJK UNIFIED IDEOGRAPH-6B54 # + +2F8F2 ; 3C4E ; MA # ( 㱎 → 㱎 ) CJK COMPATIBILITY IDEOGRAPH-2F8F2 → CJK UNIFIED IDEOGRAPH-3C4E # + +2F4C ; 6B62 ; MA #* ( ⽌ → 止 ) KANGXI RADICAL STOP → CJK UNIFIED IDEOGRAPH-6B62 # + +2EED ; 6B6F ; MA #* ( ⻭ → 歯 ) CJK RADICAL J-SIMPLIFIED TOOTH → CJK UNIFIED IDEOGRAPH-6B6F # + +2F8F3 ; 6B72 ; MA # ( 歲 → 歲 ) CJK COMPATIBILITY IDEOGRAPH-2F8F3 → CJK UNIFIED IDEOGRAPH-6B72 # + +F98C ; 6B77 ; MA # ( 
歷 → 歷 ) CJK COMPATIBILITY IDEOGRAPH-F98C → CJK UNIFIED IDEOGRAPH-6B77 # + +FA95 ; 6B79 ; MA # ( 歹 → 歹 ) CJK COMPATIBILITY IDEOGRAPH-FA95 → CJK UNIFIED IDEOGRAPH-6B79 # +2F4D ; 6B79 ; MA #* ( ⽍ → 歹 ) KANGXI RADICAL DEATH → CJK UNIFIED IDEOGRAPH-6B79 # + +2E9E ; 6B7A ; MA #* ( ⺞ → 歺 ) CJK RADICAL DEATH → CJK UNIFIED IDEOGRAPH-6B7A # + +2F8F4 ; 6B9F ; MA # ( 殟 → 殟 ) CJK COMPATIBILITY IDEOGRAPH-2F8F4 → CJK UNIFIED IDEOGRAPH-6B9F # + +F9A5 ; 6BAE ; MA # ( 殮 → 殮 ) CJK COMPATIBILITY IDEOGRAPH-F9A5 → CJK UNIFIED IDEOGRAPH-6BAE # + +2F4E ; 6BB3 ; MA #* ( ⽎ → 殳 ) KANGXI RADICAL WEAPON → CJK UNIFIED IDEOGRAPH-6BB3 # + +F970 ; 6BBA ; MA # ( 殺 → 殺 ) CJK COMPATIBILITY IDEOGRAPH-F970 → CJK UNIFIED IDEOGRAPH-6BBA # +FA96 ; 6BBA ; MA # ( 殺 → 殺 ) CJK COMPATIBILITY IDEOGRAPH-FA96 → CJK UNIFIED IDEOGRAPH-6BBA # +2F8F5 ; 6BBA ; MA # ( 殺 → 殺 ) CJK COMPATIBILITY IDEOGRAPH-2F8F5 → CJK UNIFIED IDEOGRAPH-6BBA # + +2F8F6 ; 6BBB ; MA # ( 殻 → 殻 ) CJK COMPATIBILITY IDEOGRAPH-2F8F6 → CJK UNIFIED IDEOGRAPH-6BBB # + +2F8F7 ; 23A8D ; MA # ( 𣪍 → 𣪍 ) CJK COMPATIBILITY IDEOGRAPH-2F8F7 → CJK UNIFIED IDEOGRAPH-23A8D # + +2F4F ; 6BCB ; MA #* ( ⽏ → 毋 ) KANGXI RADICAL DO NOT → CJK UNIFIED IDEOGRAPH-6BCB # + +2E9F ; 6BCD ; MA #* ( ⺟ → 母 ) CJK RADICAL MOTHER → CJK UNIFIED IDEOGRAPH-6BCD # + +2F8F9 ; 23AFA ; MA # ( 𣫺 → 𣫺 ) CJK COMPATIBILITY IDEOGRAPH-2F8F9 → CJK UNIFIED IDEOGRAPH-23AFA # + +2F50 ; 6BD4 ; MA #* ( ⽐ → 比 ) KANGXI RADICAL COMPARE → CJK UNIFIED IDEOGRAPH-6BD4 # + +2F51 ; 6BDB ; MA #* ( ⽑ → 毛 ) KANGXI RADICAL FUR → CJK UNIFIED IDEOGRAPH-6BDB # + +2F52 ; 6C0F ; MA #* ( ⽒ → 氏 ) KANGXI RADICAL CLAN → CJK UNIFIED IDEOGRAPH-6C0F # + +2EA0 ; 6C11 ; MA #* ( ⺠ → 民 ) CJK RADICAL CIVILIAN → CJK UNIFIED IDEOGRAPH-6C11 # + +2F53 ; 6C14 ; MA #* ( ⽓ → 气 ) KANGXI RADICAL STEAM → CJK UNIFIED IDEOGRAPH-6C14 # + +2F54 ; 6C34 ; MA #* ( ⽔ → 水 ) KANGXI RADICAL WATER → CJK UNIFIED IDEOGRAPH-6C34 # + +2EA1 ; 6C35 ; MA #* ( ⺡ → 氵 ) CJK RADICAL WATER ONE → CJK UNIFIED IDEOGRAPH-6C35 # + +2EA2 ; 6C3A ; MA #* ( ⺢ → 氺 ) CJK 
RADICAL WATER TWO → CJK UNIFIED IDEOGRAPH-6C3A # + +2F8FA ; 6C4E ; MA # ( 汎 → 汎 ) CJK COMPATIBILITY IDEOGRAPH-2F8FA → CJK UNIFIED IDEOGRAPH-6C4E # + +2F8FE ; 6C67 ; MA # ( 汧 → 汧 ) CJK COMPATIBILITY IDEOGRAPH-2F8FE → CJK UNIFIED IDEOGRAPH-6C67 # + +F972 ; 6C88 ; MA # ( 沈 → 沈 ) CJK COMPATIBILITY IDEOGRAPH-F972 → CJK UNIFIED IDEOGRAPH-6C88 # + +2F8FC ; 6CBF ; MA # ( 沿 → 沿 ) CJK COMPATIBILITY IDEOGRAPH-2F8FC → CJK UNIFIED IDEOGRAPH-6CBF # + +F968 ; 6CCC ; MA # ( 泌 → 泌 ) CJK COMPATIBILITY IDEOGRAPH-F968 → CJK UNIFIED IDEOGRAPH-6CCC # + +2F8FD ; 6CCD ; MA # ( 泍 → 泍 ) CJK COMPATIBILITY IDEOGRAPH-2F8FD → CJK UNIFIED IDEOGRAPH-6CCD # + +F9E3 ; 6CE5 ; MA # ( 泥 → 泥 ) CJK COMPATIBILITY IDEOGRAPH-F9E3 → CJK UNIFIED IDEOGRAPH-6CE5 # + +2F8FB ; 23CBC ; MA # ( 𣲼 → 𣲼 ) CJK COMPATIBILITY IDEOGRAPH-2F8FB → CJK UNIFIED IDEOGRAPH-23CBC # + +F915 ; 6D1B ; MA # ( 洛 → 洛 ) CJK COMPATIBILITY IDEOGRAPH-F915 → CJK UNIFIED IDEOGRAPH-6D1B # + +FA05 ; 6D1E ; MA # ( 洞 → 洞 ) CJK COMPATIBILITY IDEOGRAPH-FA05 → CJK UNIFIED IDEOGRAPH-6D1E # + +2F907 ; 6D34 ; MA # ( 洴 → 洴 ) CJK COMPATIBILITY IDEOGRAPH-2F907 → CJK UNIFIED IDEOGRAPH-6D34 # + +2F900 ; 6D3E ; MA # ( 派 → 派 ) CJK COMPATIBILITY IDEOGRAPH-2F900 → CJK UNIFIED IDEOGRAPH-6D3E # + +F9CA ; 6D41 ; MA # ( 流 → 流 ) CJK COMPATIBILITY IDEOGRAPH-F9CA → CJK UNIFIED IDEOGRAPH-6D41 # +FA97 ; 6D41 ; MA # ( 流 → 流 ) CJK COMPATIBILITY IDEOGRAPH-FA97 → CJK UNIFIED IDEOGRAPH-6D41 # +2F902 ; 6D41 ; MA # ( 流 → 流 ) CJK COMPATIBILITY IDEOGRAPH-2F902 → CJK UNIFIED IDEOGRAPH-6D41 # + +2F8FF ; 6D16 ; MA # ( 洖 → 洖 ) CJK COMPATIBILITY IDEOGRAPH-2F8FF → CJK UNIFIED IDEOGRAPH-6D16 # + +2F903 ; 6D69 ; MA # ( 浩 → 浩 ) CJK COMPATIBILITY IDEOGRAPH-2F903 → CJK UNIFIED IDEOGRAPH-6D69 # + +F92A ; 6D6A ; MA # ( 浪 → 浪 ) CJK COMPATIBILITY IDEOGRAPH-F92A → CJK UNIFIED IDEOGRAPH-6D6A # + +FA45 ; 6D77 ; MA # ( 海 → 海 ) CJK COMPATIBILITY IDEOGRAPH-FA45 → CJK UNIFIED IDEOGRAPH-6D77 # +2F901 ; 6D77 ; MA # ( 海 → 海 ) CJK COMPATIBILITY IDEOGRAPH-2F901 → CJK UNIFIED IDEOGRAPH-6D77 # + +2F904 ; 
6D78 ; MA # ( 浸 → 浸 ) CJK COMPATIBILITY IDEOGRAPH-2F904 → CJK UNIFIED IDEOGRAPH-6D78 # + +2F905 ; 6D85 ; MA # ( 涅 → 涅 ) CJK COMPATIBILITY IDEOGRAPH-2F905 → CJK UNIFIED IDEOGRAPH-6D85 # +23D40 ; 6D85 ; MA # ( 𣵀 → 涅 ) CJK UNIFIED IDEOGRAPH-23D40 → CJK UNIFIED IDEOGRAPH-6D85 # →涅→ + +2F906 ; 23D1E ; MA # ( 𣴞 → 𣴞 ) CJK COMPATIBILITY IDEOGRAPH-2F906 → CJK UNIFIED IDEOGRAPH-23D1E # + +F9F5 ; 6DCB ; MA # ( 淋 → 淋 ) CJK COMPATIBILITY IDEOGRAPH-F9F5 → CJK UNIFIED IDEOGRAPH-6DCB # + +F94D ; 6DDA ; MA # ( 淚 → 淚 ) CJK COMPATIBILITY IDEOGRAPH-F94D → CJK UNIFIED IDEOGRAPH-6DDA # + +F9D6 ; 6DEA ; MA # ( 淪 → 淪 ) CJK COMPATIBILITY IDEOGRAPH-F9D6 → CJK UNIFIED IDEOGRAPH-6DEA # + +2F90E ; 6DF9 ; MA # ( 淹 → 淹 ) CJK COMPATIBILITY IDEOGRAPH-2F90E → CJK UNIFIED IDEOGRAPH-6DF9 # + +FA46 ; 6E1A ; MA # ( 渚 → 渚 ) CJK COMPATIBILITY IDEOGRAPH-FA46 → CJK UNIFIED IDEOGRAPH-6E1A # + +2F908 ; 6E2F ; MA # ( 港 → 港 ) CJK COMPATIBILITY IDEOGRAPH-2F908 → CJK UNIFIED IDEOGRAPH-6E2F # + +2F909 ; 6E6E ; MA # ( 湮 → 湮 ) CJK COMPATIBILITY IDEOGRAPH-2F909 → CJK UNIFIED IDEOGRAPH-6E6E # + +6F59 ; 6E88 ; MA # ( 潙 → 溈 ) CJK UNIFIED IDEOGRAPH-6F59 → CJK UNIFIED IDEOGRAPH-6E88 # + +FA99 ; 6ECB ; MA # ( 滋 → 滋 ) CJK COMPATIBILITY IDEOGRAPH-FA99 → CJK UNIFIED IDEOGRAPH-6ECB # +2F90B ; 6ECB ; MA # ( 滋 → 滋 ) CJK COMPATIBILITY IDEOGRAPH-2F90B → CJK UNIFIED IDEOGRAPH-6ECB # + +F9CB ; 6E9C ; MA # ( 溜 → 溜 ) CJK COMPATIBILITY IDEOGRAPH-F9CB → CJK UNIFIED IDEOGRAPH-6E9C # + +F9EC ; 6EBA ; MA # ( 溺 → 溺 ) CJK COMPATIBILITY IDEOGRAPH-F9EC → CJK UNIFIED IDEOGRAPH-6EBA # + +2F90C ; 6EC7 ; MA # ( 滇 → 滇 ) CJK COMPATIBILITY IDEOGRAPH-2F90C → CJK UNIFIED IDEOGRAPH-6EC7 # + +F904 ; 6ED1 ; MA # ( 滑 → 滑 ) CJK COMPATIBILITY IDEOGRAPH-F904 → CJK UNIFIED IDEOGRAPH-6ED1 # + +FA98 ; 6EDB ; MA # ( 滛 → 滛 ) CJK COMPATIBILITY IDEOGRAPH-FA98 → CJK UNIFIED IDEOGRAPH-6EDB # + +2F90A ; 3D33 ; MA # ( 㴳 → 㴳 ) CJK COMPATIBILITY IDEOGRAPH-2F90A → CJK UNIFIED IDEOGRAPH-3D33 # + +F94E ; 6F0F ; MA # ( 漏 → 漏 ) CJK COMPATIBILITY IDEOGRAPH-F94E → CJK UNIFIED 
IDEOGRAPH-6F0F # + +FA47 ; 6F22 ; MA # ( 漢 → 漢 ) CJK COMPATIBILITY IDEOGRAPH-FA47 → CJK UNIFIED IDEOGRAPH-6F22 # +FA9A ; 6F22 ; MA # ( 漢 → 漢 ) CJK COMPATIBILITY IDEOGRAPH-FA9A → CJK UNIFIED IDEOGRAPH-6F22 # + +F992 ; 6F23 ; MA # ( 漣 → 漣 ) CJK COMPATIBILITY IDEOGRAPH-F992 → CJK UNIFIED IDEOGRAPH-6F23 # + +2F90D ; 23ED1 ; MA # ( 𣻑 → 𣻑 ) CJK COMPATIBILITY IDEOGRAPH-2F90D → CJK UNIFIED IDEOGRAPH-23ED1 # + +2F90F ; 6F6E ; MA # ( 潮 → 潮 ) CJK COMPATIBILITY IDEOGRAPH-2F90F → CJK UNIFIED IDEOGRAPH-6F6E # + +2F910 ; 23F5E ; MA # ( 𣽞 → 𣽞 ) CJK COMPATIBILITY IDEOGRAPH-2F910 → CJK UNIFIED IDEOGRAPH-23F5E # + +2F911 ; 23F8E ; MA # ( 𣾎 → 𣾎 ) CJK COMPATIBILITY IDEOGRAPH-2F911 → CJK UNIFIED IDEOGRAPH-23F8E # + +2F912 ; 6FC6 ; MA # ( 濆 → 濆 ) CJK COMPATIBILITY IDEOGRAPH-2F912 → CJK UNIFIED IDEOGRAPH-6FC6 # + +F922 ; 6FEB ; MA # ( 濫 → 濫 ) CJK COMPATIBILITY IDEOGRAPH-F922 → CJK UNIFIED IDEOGRAPH-6FEB # + +F984 ; 6FFE ; MA # ( 濾 → 濾 ) CJK COMPATIBILITY IDEOGRAPH-F984 → CJK UNIFIED IDEOGRAPH-6FFE # + +2F915 ; 701B ; MA # ( 瀛 → 瀛 ) CJK COMPATIBILITY IDEOGRAPH-2F915 → CJK UNIFIED IDEOGRAPH-701B # + +FA9B ; 701E ; MA # ( 瀞 → 瀞 ) CJK COMPATIBILITY IDEOGRAPH-FA9B → CJK UNIFIED IDEOGRAPH-701E # +2F914 ; 701E ; MA # ( 瀞 → 瀞 ) CJK COMPATIBILITY IDEOGRAPH-2F914 → CJK UNIFIED IDEOGRAPH-701E # + +2F913 ; 7039 ; MA # ( 瀹 → 瀹 ) CJK COMPATIBILITY IDEOGRAPH-2F913 → CJK UNIFIED IDEOGRAPH-7039 # + +2F917 ; 704A ; MA # ( 灊 → 灊 ) CJK COMPATIBILITY IDEOGRAPH-2F917 → CJK UNIFIED IDEOGRAPH-704A # + +2F916 ; 3D96 ; MA # ( 㶖 → 㶖 ) CJK COMPATIBILITY IDEOGRAPH-2F916 → CJK UNIFIED IDEOGRAPH-3D96 # + +2F55 ; 706B ; MA #* ( ⽕ → 火 ) KANGXI RADICAL FIRE → CJK UNIFIED IDEOGRAPH-706B # + +2EA3 ; 706C ; MA #* ( ⺣ → 灬 ) CJK RADICAL FIRE → CJK UNIFIED IDEOGRAPH-706C # + +2F835 ; 7070 ; MA # ( 灰 → 灰 ) CJK COMPATIBILITY IDEOGRAPH-2F835 → CJK UNIFIED IDEOGRAPH-7070 # + +2F919 ; 7077 ; MA # ( 灷 → 灷 ) CJK COMPATIBILITY IDEOGRAPH-2F919 → CJK UNIFIED IDEOGRAPH-7077 # + +2F918 ; 707D ; MA # ( 災 → 災 ) CJK COMPATIBILITY 
IDEOGRAPH-2F918 → CJK UNIFIED IDEOGRAPH-707D # + +F9FB ; 7099 ; MA # ( 炙 → 炙 ) CJK COMPATIBILITY IDEOGRAPH-F9FB → CJK UNIFIED IDEOGRAPH-7099 # + +2F91A ; 70AD ; MA # ( 炭 → 炭 ) CJK COMPATIBILITY IDEOGRAPH-2F91A → CJK UNIFIED IDEOGRAPH-70AD # + +F99F ; 70C8 ; MA # ( 烈 → 烈 ) CJK COMPATIBILITY IDEOGRAPH-F99F → CJK UNIFIED IDEOGRAPH-70C8 # + +F916 ; 70D9 ; MA # ( 烙 → 烙 ) CJK COMPATIBILITY IDEOGRAPH-F916 → CJK UNIFIED IDEOGRAPH-70D9 # + +FA48 ; 716E ; MA # ( 煮 → 煮 ) CJK COMPATIBILITY IDEOGRAPH-FA48 → CJK UNIFIED IDEOGRAPH-716E # +FA9C ; 716E ; MA # ( 煮 → 煮 ) CJK COMPATIBILITY IDEOGRAPH-FA9C → CJK UNIFIED IDEOGRAPH-716E # + +2F91D ; 24263 ; MA # ( 𤉣 → 𤉣 ) CJK COMPATIBILITY IDEOGRAPH-2F91D → CJK UNIFIED IDEOGRAPH-24263 # + +2F91C ; 7145 ; MA # ( 煅 → 煅 ) CJK COMPATIBILITY IDEOGRAPH-2F91C → CJK UNIFIED IDEOGRAPH-7145 # + +F993 ; 7149 ; MA # ( 煉 → 煉 ) CJK COMPATIBILITY IDEOGRAPH-F993 → CJK UNIFIED IDEOGRAPH-7149 # + +FA6C ; 242EE ; MA # ( 𤋮 → 𤋮 ) CJK COMPATIBILITY IDEOGRAPH-FA6C → CJK UNIFIED IDEOGRAPH-242EE # + +2F91E ; 719C ; MA # ( 熜 → 熜 ) CJK COMPATIBILITY IDEOGRAPH-2F91E → CJK UNIFIED IDEOGRAPH-719C # + +F9C0 ; 71CE ; MA # ( 燎 → 燎 ) CJK COMPATIBILITY IDEOGRAPH-F9C0 → CJK UNIFIED IDEOGRAPH-71CE # + +F9EE ; 71D0 ; MA # ( 燐 → 燐 ) CJK COMPATIBILITY IDEOGRAPH-F9EE → CJK UNIFIED IDEOGRAPH-71D0 # + +2F91F ; 243AB ; MA # ( 𤎫 → 𤎫 ) CJK COMPATIBILITY IDEOGRAPH-2F91F → CJK UNIFIED IDEOGRAPH-243AB # + +F932 ; 7210 ; MA # ( 爐 → 爐 ) CJK COMPATIBILITY IDEOGRAPH-F932 → CJK UNIFIED IDEOGRAPH-7210 # + +F91E ; 721B ; MA # ( 爛 → 爛 ) CJK COMPATIBILITY IDEOGRAPH-F91E → CJK UNIFIED IDEOGRAPH-721B # + +2F920 ; 7228 ; MA # ( 爨 → 爨 ) CJK COMPATIBILITY IDEOGRAPH-2F920 → CJK UNIFIED IDEOGRAPH-7228 # + +2F56 ; 722A ; MA #* ( ⽖ → 爪 ) KANGXI RADICAL CLAW → CJK UNIFIED IDEOGRAPH-722A # + +FA49 ; 722B ; MA # ( 爫 → 爫 ) CJK COMPATIBILITY IDEOGRAPH-FA49 → CJK UNIFIED IDEOGRAPH-722B # +2EA4 ; 722B ; MA #* ( ⺤ → 爫 ) CJK RADICAL PAW ONE → CJK UNIFIED IDEOGRAPH-722B # + +FA9E ; 7235 ; MA # ( 爵 → 爵 ) CJK 
COMPATIBILITY IDEOGRAPH-FA9E → CJK UNIFIED IDEOGRAPH-7235 # +2F921 ; 7235 ; MA # ( 爵 → 爵 ) CJK COMPATIBILITY IDEOGRAPH-2F921 → CJK UNIFIED IDEOGRAPH-7235 # + +2F57 ; 7236 ; MA #* ( ⽗ → 父 ) KANGXI RADICAL FATHER → CJK UNIFIED IDEOGRAPH-7236 # + +2F58 ; 723B ; MA #* ( ⽘ → 爻 ) KANGXI RADICAL DOUBLE X → CJK UNIFIED IDEOGRAPH-723B # + +2EA6 ; 4E2C ; MA #* ( ⺦ → 丬 ) CJK RADICAL SIMPLIFIED HALF TREE TRUNK → CJK UNIFIED IDEOGRAPH-4E2C # + +2F5A ; 7247 ; MA #* ( ⽚ → 片 ) KANGXI RADICAL SLICE → CJK UNIFIED IDEOGRAPH-7247 # + +2F922 ; 7250 ; MA # ( 牐 → 牐 ) CJK COMPATIBILITY IDEOGRAPH-2F922 → CJK UNIFIED IDEOGRAPH-7250 # + +2F5B ; 7259 ; MA #* ( ⽛ → 牙 ) KANGXI RADICAL FANG → CJK UNIFIED IDEOGRAPH-7259 # + +2F923 ; 24608 ; MA # ( 𤘈 → 𤘈 ) CJK COMPATIBILITY IDEOGRAPH-2F923 → CJK UNIFIED IDEOGRAPH-24608 # + +2F5C ; 725B ; MA #* ( ⽜ → 牛 ) KANGXI RADICAL COW → CJK UNIFIED IDEOGRAPH-725B # + +F946 ; 7262 ; MA # ( 牢 → 牢 ) CJK COMPATIBILITY IDEOGRAPH-F946 → CJK UNIFIED IDEOGRAPH-7262 # + +2F924 ; 7280 ; MA # ( 犀 → 犀 ) CJK COMPATIBILITY IDEOGRAPH-2F924 → CJK UNIFIED IDEOGRAPH-7280 # + +2F925 ; 7295 ; MA # ( 犕 → 犕 ) CJK COMPATIBILITY IDEOGRAPH-2F925 → CJK UNIFIED IDEOGRAPH-7295 # + +2F5D ; 72AC ; MA #* ( ⽝ → 犬 ) KANGXI RADICAL DOG → CJK UNIFIED IDEOGRAPH-72AC # + +2EA8 ; 72AD ; MA #* ( ⺨ → 犭 ) CJK RADICAL DOG → CJK UNIFIED IDEOGRAPH-72AD # + +FA9F ; 72AF ; MA # ( 犯 → 犯 ) CJK COMPATIBILITY IDEOGRAPH-FA9F → CJK UNIFIED IDEOGRAPH-72AF # + +F9FA ; 72C0 ; MA # ( 狀 → 狀 ) CJK COMPATIBILITY IDEOGRAPH-F9FA → CJK UNIFIED IDEOGRAPH-72C0 # + +2F926 ; 24735 ; MA # ( 𤜵 → 𤜵 ) CJK COMPATIBILITY IDEOGRAPH-2F926 → CJK UNIFIED IDEOGRAPH-24735 # + +F92B ; 72FC ; MA # ( 狼 → 狼 ) CJK COMPATIBILITY IDEOGRAPH-F92B → CJK UNIFIED IDEOGRAPH-72FC # + +FA16 ; 732A ; MA # ( 猪 → 猪 ) CJK COMPATIBILITY IDEOGRAPH-FA16 → CJK UNIFIED IDEOGRAPH-732A # +FAA0 ; 732A ; MA # ( 猪 → 猪 ) CJK COMPATIBILITY IDEOGRAPH-FAA0 → CJK UNIFIED IDEOGRAPH-732A # + +2AEC5 ; 24814 ; MA # ( 𪻅 → 𤠔 ) CJK UNIFIED IDEOGRAPH-2AEC5 → CJK UNIFIED 
IDEOGRAPH-24814 # →𤠔→ +2F927 ; 24814 ; MA # ( 𤠔 → 𤠔 ) CJK COMPATIBILITY IDEOGRAPH-2F927 → CJK UNIFIED IDEOGRAPH-24814 # + +F9A7 ; 7375 ; MA # ( 獵 → 獵 ) CJK COMPATIBILITY IDEOGRAPH-F9A7 → CJK UNIFIED IDEOGRAPH-7375 # + +2F928 ; 737A ; MA # ( 獺 → 獺 ) CJK COMPATIBILITY IDEOGRAPH-2F928 → CJK UNIFIED IDEOGRAPH-737A # + +2F5E ; 7384 ; MA #* ( ⽞ → 玄 ) KANGXI RADICAL PROFOUND → CJK UNIFIED IDEOGRAPH-7384 # + +F961 ; 7387 ; MA # ( 率 → 率 ) CJK COMPATIBILITY IDEOGRAPH-F961 → CJK UNIFIED IDEOGRAPH-7387 # +F9DB ; 7387 ; MA # ( 率 → 率 ) CJK COMPATIBILITY IDEOGRAPH-F9DB → CJK UNIFIED IDEOGRAPH-7387 # + +2F5F ; 7389 ; MA #* ( ⽟ → 玉 ) KANGXI RADICAL JADE → CJK UNIFIED IDEOGRAPH-7389 # + +2F929 ; 738B ; MA # ( 王 → 王 ) CJK COMPATIBILITY IDEOGRAPH-2F929 → CJK UNIFIED IDEOGRAPH-738B # + +2F92A ; 3EAC ; MA # ( 㺬 → 㺬 ) CJK COMPATIBILITY IDEOGRAPH-2F92A → CJK UNIFIED IDEOGRAPH-3EAC # + +2F92B ; 73A5 ; MA # ( 玥 → 玥 ) CJK COMPATIBILITY IDEOGRAPH-2F92B → CJK UNIFIED IDEOGRAPH-73A5 # +248FD ; 73A5 ; MA # ( 𤣽 → 玥 ) CJK UNIFIED IDEOGRAPH-248FD → CJK UNIFIED IDEOGRAPH-73A5 # →玥→ + +F9AD ; 73B2 ; MA # ( 玲 → 玲 ) CJK COMPATIBILITY IDEOGRAPH-F9AD → CJK UNIFIED IDEOGRAPH-73B2 # + +2F92C ; 3EB8 ; MA # ( 㺸 → 㺸 ) CJK COMPATIBILITY IDEOGRAPH-2F92C → CJK UNIFIED IDEOGRAPH-3EB8 # +2F92D ; 3EB8 ; MA # ( 㺸 → 㺸 ) CJK COMPATIBILITY IDEOGRAPH-2F92D → CJK UNIFIED IDEOGRAPH-3EB8 # + +F917 ; 73DE ; MA # ( 珞 → 珞 ) CJK COMPATIBILITY IDEOGRAPH-F917 → CJK UNIFIED IDEOGRAPH-73DE # + +F9CC ; 7409 ; MA # ( 琉 → 琉 ) CJK COMPATIBILITY IDEOGRAPH-F9CC → CJK UNIFIED IDEOGRAPH-7409 # + +F9E4 ; 7406 ; MA # ( 理 → 理 ) CJK COMPATIBILITY IDEOGRAPH-F9E4 → CJK UNIFIED IDEOGRAPH-7406 # + +FA4A ; 7422 ; MA # ( 琢 → 琢 ) CJK COMPATIBILITY IDEOGRAPH-FA4A → CJK UNIFIED IDEOGRAPH-7422 # + +2F92E ; 7447 ; MA # ( 瑇 → 瑇 ) CJK COMPATIBILITY IDEOGRAPH-2F92E → CJK UNIFIED IDEOGRAPH-7447 # + +2F92F ; 745C ; MA # ( 瑜 → 瑜 ) CJK COMPATIBILITY IDEOGRAPH-2F92F → CJK UNIFIED IDEOGRAPH-745C # + +F9AE ; 7469 ; MA # ( 瑩 → 瑩 ) CJK COMPATIBILITY IDEOGRAPH-F9AE 
→ CJK UNIFIED IDEOGRAPH-7469 # + +FAA1 ; 7471 ; MA # ( 瑱 → 瑱 ) CJK COMPATIBILITY IDEOGRAPH-FAA1 → CJK UNIFIED IDEOGRAPH-7471 # +2F930 ; 7471 ; MA # ( 瑱 → 瑱 ) CJK COMPATIBILITY IDEOGRAPH-2F930 → CJK UNIFIED IDEOGRAPH-7471 # + +2F931 ; 7485 ; MA # ( 璅 → 璅 ) CJK COMPATIBILITY IDEOGRAPH-2F931 → CJK UNIFIED IDEOGRAPH-7485 # + +F994 ; 7489 ; MA # ( 璉 → 璉 ) CJK COMPATIBILITY IDEOGRAPH-F994 → CJK UNIFIED IDEOGRAPH-7489 # + +F9EF ; 7498 ; MA # ( 璘 → 璘 ) CJK COMPATIBILITY IDEOGRAPH-F9EF → CJK UNIFIED IDEOGRAPH-7498 # + +2F932 ; 74CA ; MA # ( 瓊 → 瓊 ) CJK COMPATIBILITY IDEOGRAPH-2F932 → CJK UNIFIED IDEOGRAPH-74CA # + +2F60 ; 74DC ; MA #* ( ⽠ → 瓜 ) KANGXI RADICAL MELON → CJK UNIFIED IDEOGRAPH-74DC # + +2F61 ; 74E6 ; MA #* ( ⽡ → 瓦 ) KANGXI RADICAL TILE → CJK UNIFIED IDEOGRAPH-74E6 # + +2F933 ; 3F1B ; MA # ( 㼛 → 㼛 ) CJK COMPATIBILITY IDEOGRAPH-2F933 → CJK UNIFIED IDEOGRAPH-3F1B # + +FAA2 ; 7506 ; MA # ( 甆 → 甆 ) CJK COMPATIBILITY IDEOGRAPH-FAA2 → CJK UNIFIED IDEOGRAPH-7506 # + +2F62 ; 7518 ; MA #* ( ⽢ → 甘 ) KANGXI RADICAL SWEET → CJK UNIFIED IDEOGRAPH-7518 # + +2F63 ; 751F ; MA #* ( ⽣ → 生 ) KANGXI RADICAL LIFE → CJK UNIFIED IDEOGRAPH-751F # + +2F934 ; 7524 ; MA # ( 甤 → 甤 ) CJK COMPATIBILITY IDEOGRAPH-2F934 → CJK UNIFIED IDEOGRAPH-7524 # + +2F64 ; 7528 ; MA #* ( ⽤ → 用 ) KANGXI RADICAL USE → CJK UNIFIED IDEOGRAPH-7528 # + +2F65 ; 7530 ; MA #* ( ⽥ → 田 ) KANGXI RADICAL FIELD → CJK UNIFIED IDEOGRAPH-7530 # + +FAA3 ; 753B ; MA # ( 画 → 画 ) CJK COMPATIBILITY IDEOGRAPH-FAA3 → CJK UNIFIED IDEOGRAPH-753B # + +2F936 ; 753E ; MA # ( 甾 → 甾 ) CJK COMPATIBILITY IDEOGRAPH-2F936 → CJK UNIFIED IDEOGRAPH-753E # + +2F935 ; 24C36 ; MA # ( 𤰶 → 𤰶 ) CJK COMPATIBILITY IDEOGRAPH-2F935 → CJK UNIFIED IDEOGRAPH-24C36 # + +F9CD ; 7559 ; MA # ( 留 → 留 ) CJK COMPATIBILITY IDEOGRAPH-F9CD → CJK UNIFIED IDEOGRAPH-7559 # + +F976 ; 7565 ; MA # ( 略 → 略 ) CJK COMPATIBILITY IDEOGRAPH-F976 → CJK UNIFIED IDEOGRAPH-7565 # + +F962 ; 7570 ; MA # ( 異 → 異 ) CJK COMPATIBILITY IDEOGRAPH-F962 → CJK UNIFIED IDEOGRAPH-7570 # +2F938 ; 
7570 ; MA # ( 異 → 異 ) CJK COMPATIBILITY IDEOGRAPH-2F938 → CJK UNIFIED IDEOGRAPH-7570 # + +2F937 ; 24C92 ; MA # ( 𤲒 → 𤲒 ) CJK COMPATIBILITY IDEOGRAPH-2F937 → CJK UNIFIED IDEOGRAPH-24C92 # + +2F66 ; 758B ; MA #* ( ⽦ → 疋 ) KANGXI RADICAL BOLT OF CLOTH → CJK UNIFIED IDEOGRAPH-758B # + +2F67 ; 7592 ; MA #* ( ⽧ → 疒 ) KANGXI RADICAL SICKNESS → CJK UNIFIED IDEOGRAPH-7592 # + +F9E5 ; 75E2 ; MA # ( 痢 → 痢 ) CJK COMPATIBILITY IDEOGRAPH-F9E5 → CJK UNIFIED IDEOGRAPH-75E2 # + +2F93A ; 7610 ; MA # ( 瘐 → 瘐 ) CJK COMPATIBILITY IDEOGRAPH-2F93A → CJK UNIFIED IDEOGRAPH-7610 # + +FAA5 ; 761F ; MA # ( 瘟 → 瘟 ) CJK COMPATIBILITY IDEOGRAPH-FAA5 → CJK UNIFIED IDEOGRAPH-761F # + +FAA4 ; 761D ; MA # ( 瘝 → 瘝 ) CJK COMPATIBILITY IDEOGRAPH-FAA4 → CJK UNIFIED IDEOGRAPH-761D # + +F9C1 ; 7642 ; MA # ( 療 → 療 ) CJK COMPATIBILITY IDEOGRAPH-F9C1 → CJK UNIFIED IDEOGRAPH-7642 # + +F90E ; 7669 ; MA # ( 癩 → 癩 ) CJK COMPATIBILITY IDEOGRAPH-F90E → CJK UNIFIED IDEOGRAPH-7669 # + +2F68 ; 7676 ; MA #* ( ⽨ → 癶 ) KANGXI RADICAL DOTTED TENT → CJK UNIFIED IDEOGRAPH-7676 # + +2F69 ; 767D ; MA #* ( ⽩ → 白 ) KANGXI RADICAL WHITE → CJK UNIFIED IDEOGRAPH-767D # + +2F93B ; 24FA1 ; MA # ( 𤾡 → 𤾡 ) CJK COMPATIBILITY IDEOGRAPH-2F93B → CJK UNIFIED IDEOGRAPH-24FA1 # + +2F93C ; 24FB8 ; MA # ( 𤾸 → 𤾸 ) CJK COMPATIBILITY IDEOGRAPH-2F93C → CJK UNIFIED IDEOGRAPH-24FB8 # + +2F6A ; 76AE ; MA #* ( ⽪ → 皮 ) KANGXI RADICAL SKIN → CJK UNIFIED IDEOGRAPH-76AE # + +2F6B ; 76BF ; MA #* ( ⽫ → 皿 ) KANGXI RADICAL DISH → CJK UNIFIED IDEOGRAPH-76BF # + +2F93D ; 25044 ; MA # ( 𥁄 → 𥁄 ) CJK COMPATIBILITY IDEOGRAPH-2F93D → CJK UNIFIED IDEOGRAPH-25044 # + +2F93E ; 3FFC ; MA # ( 㿼 → 㿼 ) CJK COMPATIBILITY IDEOGRAPH-2F93E → CJK UNIFIED IDEOGRAPH-3FFC # + +FA17 ; 76CA ; MA # ( 益 → 益 ) CJK COMPATIBILITY IDEOGRAPH-FA17 → CJK UNIFIED IDEOGRAPH-76CA # +FAA6 ; 76CA ; MA # ( 益 → 益 ) CJK COMPATIBILITY IDEOGRAPH-FAA6 → CJK UNIFIED IDEOGRAPH-76CA # + +FAA7 ; 76DB ; MA # ( 盛 → 盛 ) CJK COMPATIBILITY IDEOGRAPH-FAA7 → CJK UNIFIED IDEOGRAPH-76DB # + +F933 ; 76E7 ; MA # ( 盧 
→ 盧 ) CJK COMPATIBILITY IDEOGRAPH-F933 → CJK UNIFIED IDEOGRAPH-76E7 # + +2F93F ; 4008 ; MA # ( 䀈 → 䀈 ) CJK COMPATIBILITY IDEOGRAPH-2F93F → CJK UNIFIED IDEOGRAPH-4008 # + +2F6C ; 76EE ; MA #* ( ⽬ → 目 ) KANGXI RADICAL EYE → CJK UNIFIED IDEOGRAPH-76EE # + +FAA8 ; 76F4 ; MA # ( 直 → 直 ) CJK COMPATIBILITY IDEOGRAPH-FAA8 → CJK UNIFIED IDEOGRAPH-76F4 # +2F940 ; 76F4 ; MA # ( 直 → 直 ) CJK COMPATIBILITY IDEOGRAPH-2F940 → CJK UNIFIED IDEOGRAPH-76F4 # + +2F942 ; 250F2 ; MA # ( 𥃲 → 𥃲 ) CJK COMPATIBILITY IDEOGRAPH-2F942 → CJK UNIFIED IDEOGRAPH-250F2 # + +2F941 ; 250F3 ; MA # ( 𥃳 → 𥃳 ) CJK COMPATIBILITY IDEOGRAPH-2F941 → CJK UNIFIED IDEOGRAPH-250F3 # + +F96D ; 7701 ; MA # ( 省 → 省 ) CJK COMPATIBILITY IDEOGRAPH-F96D → CJK UNIFIED IDEOGRAPH-7701 # + +FAD3 ; 4018 ; MA # ( 䀘 → 䀘 ) CJK COMPATIBILITY IDEOGRAPH-FAD3 → CJK UNIFIED IDEOGRAPH-4018 # + +2F943 ; 25119 ; MA # ( 𥄙 → 𥄙 ) CJK COMPATIBILITY IDEOGRAPH-2F943 → CJK UNIFIED IDEOGRAPH-25119 # +2511A ; 25119 ; MA # ( 𥄚 → 𥄙 ) CJK UNIFIED IDEOGRAPH-2511A → CJK UNIFIED IDEOGRAPH-25119 # →𥄙→ + +2F945 ; 771E ; MA # ( 眞 → 眞 ) CJK COMPATIBILITY IDEOGRAPH-2F945 → CJK UNIFIED IDEOGRAPH-771E # + +2F946 ; 771F ; MA # ( 真 → 真 ) CJK COMPATIBILITY IDEOGRAPH-2F946 → CJK UNIFIED IDEOGRAPH-771F # +2F947 ; 771F ; MA # ( 真 → 真 ) CJK COMPATIBILITY IDEOGRAPH-2F947 → CJK UNIFIED IDEOGRAPH-771F # + +2F944 ; 25133 ; MA # ( 𥄳 → 𥄳 ) CJK COMPATIBILITY IDEOGRAPH-2F944 → CJK UNIFIED IDEOGRAPH-25133 # + +FAAA ; 7740 ; MA # ( 着 → 着 ) CJK COMPATIBILITY IDEOGRAPH-FAAA → CJK UNIFIED IDEOGRAPH-7740 # + +FAA9 ; 774A ; MA # ( 睊 → 睊 ) CJK COMPATIBILITY IDEOGRAPH-FAA9 → CJK UNIFIED IDEOGRAPH-774A # +2F948 ; 774A ; MA # ( 睊 → 睊 ) CJK COMPATIBILITY IDEOGRAPH-2F948 → CJK UNIFIED IDEOGRAPH-774A # + +9FC3 ; 4039 ; MA # ( 鿃 → 䀹 ) CJK UNIFIED IDEOGRAPH-9FC3 → CJK UNIFIED IDEOGRAPH-4039 # →䀹→ +FAD4 ; 4039 ; MA # ( 䀹 → 䀹 ) CJK COMPATIBILITY IDEOGRAPH-FAD4 → CJK UNIFIED IDEOGRAPH-4039 # +2F949 ; 4039 ; MA # ( 䀹 → 䀹 ) CJK COMPATIBILITY IDEOGRAPH-2F949 → CJK UNIFIED IDEOGRAPH-4039 # + 
+6663 ; 403F ; MA # ( 晣 → 䀿 ) CJK UNIFIED IDEOGRAPH-6663 → CJK UNIFIED IDEOGRAPH-403F # + +2F94B ; 4046 ; MA # ( 䁆 → 䁆 ) CJK COMPATIBILITY IDEOGRAPH-2F94B → CJK UNIFIED IDEOGRAPH-4046 # + +2F94A ; 778B ; MA # ( 瞋 → 瞋 ) CJK COMPATIBILITY IDEOGRAPH-2F94A → CJK UNIFIED IDEOGRAPH-778B # + +FAD5 ; 25249 ; MA # ( 𥉉 → 𥉉 ) CJK COMPATIBILITY IDEOGRAPH-FAD5 → CJK UNIFIED IDEOGRAPH-25249 # + +FA9D ; 77A7 ; MA # ( 瞧 → 瞧 ) CJK COMPATIBILITY IDEOGRAPH-FA9D → CJK UNIFIED IDEOGRAPH-77A7 # + +2F6D ; 77DB ; MA #* ( ⽭ → 矛 ) KANGXI RADICAL SPEAR → CJK UNIFIED IDEOGRAPH-77DB # + +2F6E ; 77E2 ; MA #* ( ⽮ → 矢 ) KANGXI RADICAL ARROW → CJK UNIFIED IDEOGRAPH-77E2 # + +2F6F ; 77F3 ; MA #* ( ⽯ → 石 ) KANGXI RADICAL STONE → CJK UNIFIED IDEOGRAPH-77F3 # + +2F94C ; 4096 ; MA # ( 䂖 → 䂖 ) CJK COMPATIBILITY IDEOGRAPH-2F94C → CJK UNIFIED IDEOGRAPH-4096 # + +2F94D ; 2541D ; MA # ( 𥐝 → 𥐝 ) CJK COMPATIBILITY IDEOGRAPH-2F94D → CJK UNIFIED IDEOGRAPH-2541D # + +784F ; 7814 ; MA # ( 硏 → 研 ) CJK UNIFIED IDEOGRAPH-784F → CJK UNIFIED IDEOGRAPH-7814 # + +2F94E ; 784E ; MA # ( 硎 → 硎 ) CJK COMPATIBILITY IDEOGRAPH-2F94E → CJK UNIFIED IDEOGRAPH-784E # + +F9CE ; 786B ; MA # ( 硫 → 硫 ) CJK COMPATIBILITY IDEOGRAPH-F9CE → CJK UNIFIED IDEOGRAPH-786B # + +F93B ; 788C ; MA # ( 碌 → 碌 ) CJK COMPATIBILITY IDEOGRAPH-F93B → CJK UNIFIED IDEOGRAPH-788C # +2F94F ; 788C ; MA # ( 碌 → 碌 ) CJK COMPATIBILITY IDEOGRAPH-2F94F → CJK UNIFIED IDEOGRAPH-788C # + +FA4B ; 7891 ; MA # ( 碑 → 碑 ) CJK COMPATIBILITY IDEOGRAPH-FA4B → CJK UNIFIED IDEOGRAPH-7891 # + +F947 ; 78CA ; MA # ( 磊 → 磊 ) CJK COMPATIBILITY IDEOGRAPH-F947 → CJK UNIFIED IDEOGRAPH-78CA # + +FAAB ; 78CC ; MA # ( 磌 → 磌 ) CJK COMPATIBILITY IDEOGRAPH-FAAB → CJK UNIFIED IDEOGRAPH-78CC # +2F950 ; 78CC ; MA # ( 磌 → 磌 ) CJK COMPATIBILITY IDEOGRAPH-2F950 → CJK UNIFIED IDEOGRAPH-78CC # + +F964 ; 78FB ; MA # ( 磻 → 磻 ) CJK COMPATIBILITY IDEOGRAPH-F964 → CJK UNIFIED IDEOGRAPH-78FB # + +2F951 ; 40E3 ; MA # ( 䃣 → 䃣 ) CJK COMPATIBILITY IDEOGRAPH-2F951 → CJK UNIFIED IDEOGRAPH-40E3 # + +F985 ; 792A 
; MA # ( 礪 → 礪 ) CJK COMPATIBILITY IDEOGRAPH-F985 → CJK UNIFIED IDEOGRAPH-792A # + +2F70 ; 793A ; MA #* ( ⽰ → 示 ) KANGXI RADICAL SPIRIT → CJK UNIFIED IDEOGRAPH-793A # + +2EAD ; 793B ; MA #* ( ⺭ → 礻 ) CJK RADICAL SPIRIT TWO → CJK UNIFIED IDEOGRAPH-793B # + +FA18 ; 793C ; MA # ( 礼 → 礼 ) CJK COMPATIBILITY IDEOGRAPH-FA18 → CJK UNIFIED IDEOGRAPH-793C # + +FA4C ; 793E ; MA # ( 社 → 社 ) CJK COMPATIBILITY IDEOGRAPH-FA4C → CJK UNIFIED IDEOGRAPH-793E # + +FA4E ; 7948 ; MA # ( 祈 → 祈 ) CJK COMPATIBILITY IDEOGRAPH-FA4E → CJK UNIFIED IDEOGRAPH-7948 # + +FA4D ; 7949 ; MA # ( 祉 → 祉 ) CJK COMPATIBILITY IDEOGRAPH-FA4D → CJK UNIFIED IDEOGRAPH-7949 # + +2F952 ; 25626 ; MA # ( 𥘦 → 𥘦 ) CJK COMPATIBILITY IDEOGRAPH-2F952 → CJK UNIFIED IDEOGRAPH-25626 # + +FA4F ; 7950 ; MA # ( 祐 → 祐 ) CJK COMPATIBILITY IDEOGRAPH-FA4F → CJK UNIFIED IDEOGRAPH-7950 # + +FA50 ; 7956 ; MA # ( 祖 → 祖 ) CJK COMPATIBILITY IDEOGRAPH-FA50 → CJK UNIFIED IDEOGRAPH-7956 # +2F953 ; 7956 ; MA # ( 祖 → 祖 ) CJK COMPATIBILITY IDEOGRAPH-2F953 → CJK UNIFIED IDEOGRAPH-7956 # + +FA51 ; 795D ; MA # ( 祝 → 祝 ) CJK COMPATIBILITY IDEOGRAPH-FA51 → CJK UNIFIED IDEOGRAPH-795D # + +FA19 ; 795E ; MA # ( 神 → 神 ) CJK COMPATIBILITY IDEOGRAPH-FA19 → CJK UNIFIED IDEOGRAPH-795E # + +FA1A ; 7965 ; MA # ( 祥 → 祥 ) CJK COMPATIBILITY IDEOGRAPH-FA1A → CJK UNIFIED IDEOGRAPH-7965 # + +FA61 ; 8996 ; MA # ( 視 → 視 ) CJK COMPATIBILITY IDEOGRAPH-FA61 → CJK UNIFIED IDEOGRAPH-8996 # +FAB8 ; 8996 ; MA # ( 視 → 視 ) CJK COMPATIBILITY IDEOGRAPH-FAB8 → CJK UNIFIED IDEOGRAPH-8996 # + +F93C ; 797F ; MA # ( 祿 → 祿 ) CJK COMPATIBILITY IDEOGRAPH-F93C → CJK UNIFIED IDEOGRAPH-797F # + +2F954 ; 2569A ; MA # ( 𥚚 → 𥚚 ) CJK COMPATIBILITY IDEOGRAPH-2F954 → CJK UNIFIED IDEOGRAPH-2569A # + +FA52 ; 798D ; MA # ( 禍 → 禍 ) CJK COMPATIBILITY IDEOGRAPH-FA52 → CJK UNIFIED IDEOGRAPH-798D # + +FA53 ; 798E ; MA # ( 禎 → 禎 ) CJK COMPATIBILITY IDEOGRAPH-FA53 → CJK UNIFIED IDEOGRAPH-798E # + +FA1B ; 798F ; MA # ( 福 → 福 ) CJK COMPATIBILITY IDEOGRAPH-FA1B → CJK UNIFIED IDEOGRAPH-798F # +2F956 ; 
798F ; MA # ( 福 → 福 ) CJK COMPATIBILITY IDEOGRAPH-2F956 → CJK UNIFIED IDEOGRAPH-798F # + +2F955 ; 256C5 ; MA # ( 𥛅 → 𥛅 ) CJK COMPATIBILITY IDEOGRAPH-2F955 → CJK UNIFIED IDEOGRAPH-256C5 # + +F9B6 ; 79AE ; MA # ( 禮 → 禮 ) CJK COMPATIBILITY IDEOGRAPH-F9B6 → CJK UNIFIED IDEOGRAPH-79AE # + +2F71 ; 79B8 ; MA #* ( ⽱ → 禸 ) KANGXI RADICAL TRACK → CJK UNIFIED IDEOGRAPH-79B8 # + +2F72 ; 79BE ; MA #* ( ⽲ → 禾 ) KANGXI RADICAL GRAIN → CJK UNIFIED IDEOGRAPH-79BE # + +F995 ; 79CA ; MA # ( 秊 → 秊 ) CJK COMPATIBILITY IDEOGRAPH-F995 → CJK UNIFIED IDEOGRAPH-79CA # + +2F958 ; 412F ; MA # ( 䄯 → 䄯 ) CJK COMPATIBILITY IDEOGRAPH-2F958 → CJK UNIFIED IDEOGRAPH-412F # + +2F957 ; 79EB ; MA # ( 秫 → 秫 ) CJK COMPATIBILITY IDEOGRAPH-2F957 → CJK UNIFIED IDEOGRAPH-79EB # + +F956 ; 7A1C ; MA # ( 稜 → 稜 ) CJK COMPATIBILITY IDEOGRAPH-F956 → CJK UNIFIED IDEOGRAPH-7A1C # + +2F95A ; 7A4A ; MA # ( 穊 → 穊 ) CJK COMPATIBILITY IDEOGRAPH-2F95A → CJK UNIFIED IDEOGRAPH-7A4A # + +FA54 ; 7A40 ; MA # ( 穀 → 穀 ) CJK COMPATIBILITY IDEOGRAPH-FA54 → CJK UNIFIED IDEOGRAPH-7A40 # +2F959 ; 7A40 ; MA # ( 穀 → 穀 ) CJK COMPATIBILITY IDEOGRAPH-2F959 → CJK UNIFIED IDEOGRAPH-7A40 # + +2F95B ; 7A4F ; MA # ( 穏 → 穏 ) CJK COMPATIBILITY IDEOGRAPH-2F95B → CJK UNIFIED IDEOGRAPH-7A4F # + +2F73 ; 7A74 ; MA #* ( ⽳ → 穴 ) KANGXI RADICAL CAVE → CJK UNIFIED IDEOGRAPH-7A74 # + +FA55 ; 7A81 ; MA # ( 突 → 突 ) CJK COMPATIBILITY IDEOGRAPH-FA55 → CJK UNIFIED IDEOGRAPH-7A81 # + +2F95C ; 2597C ; MA # ( 𥥼 → 𥥼 ) CJK COMPATIBILITY IDEOGRAPH-2F95C → CJK UNIFIED IDEOGRAPH-2597C # + +FAAC ; 7AB1 ; MA # ( 窱 → 窱 ) CJK COMPATIBILITY IDEOGRAPH-FAAC → CJK UNIFIED IDEOGRAPH-7AB1 # + +F9F7 ; 7ACB ; MA # ( 立 → 立 ) CJK COMPATIBILITY IDEOGRAPH-F9F7 → CJK UNIFIED IDEOGRAPH-7ACB # +2F74 ; 7ACB ; MA #* ( ⽴ → 立 ) KANGXI RADICAL STAND → CJK UNIFIED IDEOGRAPH-7ACB # + +2EEF ; 7ADC ; MA #* ( ⻯ → 竜 ) CJK RADICAL J-SIMPLIFIED DRAGON → CJK UNIFIED IDEOGRAPH-7ADC # + +2F95D ; 25AA7 ; MA # ( 𥪧 → 𥪧 ) CJK COMPATIBILITY IDEOGRAPH-2F95D → CJK UNIFIED IDEOGRAPH-25AA7 # +2F95E ; 25AA7 ; MA 
# ( 𥪧 → 𥪧 ) CJK COMPATIBILITY IDEOGRAPH-2F95E → CJK UNIFIED IDEOGRAPH-25AA7 # + +2F95F ; 7AEE ; MA # ( 竮 → 竮 ) CJK COMPATIBILITY IDEOGRAPH-2F95F → CJK UNIFIED IDEOGRAPH-7AEE # + +2F75 ; 7AF9 ; MA #* ( ⽵ → 竹 ) KANGXI RADICAL BAMBOO → CJK UNIFIED IDEOGRAPH-7AF9 # + +F9F8 ; 7B20 ; MA # ( 笠 → 笠 ) CJK COMPATIBILITY IDEOGRAPH-F9F8 → CJK UNIFIED IDEOGRAPH-7B20 # + +FA56 ; 7BC0 ; MA # ( 節 → 節 ) CJK COMPATIBILITY IDEOGRAPH-FA56 → CJK UNIFIED IDEOGRAPH-7BC0 # +FAAD ; 7BC0 ; MA # ( 節 → 節 ) CJK COMPATIBILITY IDEOGRAPH-FAAD → CJK UNIFIED IDEOGRAPH-7BC0 # + +2F960 ; 4202 ; MA # ( 䈂 → 䈂 ) CJK COMPATIBILITY IDEOGRAPH-2F960 → CJK UNIFIED IDEOGRAPH-4202 # + +2F961 ; 25BAB ; MA # ( 𥮫 → 𥮫 ) CJK COMPATIBILITY IDEOGRAPH-2F961 → CJK UNIFIED IDEOGRAPH-25BAB # + +2F962 ; 7BC6 ; MA # ( 篆 → 篆 ) CJK COMPATIBILITY IDEOGRAPH-2F962 → CJK UNIFIED IDEOGRAPH-7BC6 # + +2F964 ; 4227 ; MA # ( 䈧 → 䈧 ) CJK COMPATIBILITY IDEOGRAPH-2F964 → CJK UNIFIED IDEOGRAPH-4227 # + +2F963 ; 7BC9 ; MA # ( 築 → 築 ) CJK COMPATIBILITY IDEOGRAPH-2F963 → CJK UNIFIED IDEOGRAPH-7BC9 # + +2F965 ; 25C80 ; MA # ( 𥲀 → 𥲀 ) CJK COMPATIBILITY IDEOGRAPH-2F965 → CJK UNIFIED IDEOGRAPH-25C80 # + +FAD6 ; 25CD0 ; MA # ( 𥳐 → 𥳐 ) CJK COMPATIBILITY IDEOGRAPH-FAD6 → CJK UNIFIED IDEOGRAPH-25CD0 # + +F9A6 ; 7C3E ; MA # ( 簾 → 簾 ) CJK COMPATIBILITY IDEOGRAPH-F9A6 → CJK UNIFIED IDEOGRAPH-7C3E # + +F944 ; 7C60 ; MA # ( 籠 → 籠 ) CJK COMPATIBILITY IDEOGRAPH-F944 → CJK UNIFIED IDEOGRAPH-7C60 # + +2F76 ; 7C73 ; MA #* ( ⽶ → 米 ) KANGXI RADICAL RICE → CJK UNIFIED IDEOGRAPH-7C73 # + +FAAE ; 7C7B ; MA # ( 类 → 类 ) CJK COMPATIBILITY IDEOGRAPH-FAAE → CJK UNIFIED IDEOGRAPH-7C7B # + +F9F9 ; 7C92 ; MA # ( 粒 → 粒 ) CJK COMPATIBILITY IDEOGRAPH-F9F9 → CJK UNIFIED IDEOGRAPH-7C92 # + +FA1D ; 7CBE ; MA # ( 精 → 精 ) CJK COMPATIBILITY IDEOGRAPH-FA1D → CJK UNIFIED IDEOGRAPH-7CBE # + +2F966 ; 7CD2 ; MA # ( 糒 → 糒 ) CJK COMPATIBILITY IDEOGRAPH-2F966 → CJK UNIFIED IDEOGRAPH-7CD2 # + +FA03 ; 7CD6 ; MA # ( 糖 → 糖 ) CJK COMPATIBILITY IDEOGRAPH-FA03 → CJK UNIFIED IDEOGRAPH-7CD6 # + 
+2F968 ; 7CE8 ; MA # ( 糨 → 糨 ) CJK COMPATIBILITY IDEOGRAPH-2F968 → CJK UNIFIED IDEOGRAPH-7CE8 # + +2F967 ; 42A0 ; MA # ( 䊠 → 䊠 ) CJK COMPATIBILITY IDEOGRAPH-2F967 → CJK UNIFIED IDEOGRAPH-42A0 # + +2F969 ; 7CE3 ; MA # ( 糣 → 糣 ) CJK COMPATIBILITY IDEOGRAPH-2F969 → CJK UNIFIED IDEOGRAPH-7CE3 # + +F97B ; 7CE7 ; MA # ( 糧 → 糧 ) CJK COMPATIBILITY IDEOGRAPH-F97B → CJK UNIFIED IDEOGRAPH-7CE7 # + +2F77 ; 7CF8 ; MA #* ( ⽷ → 糸 ) KANGXI RADICAL SILK → CJK UNIFIED IDEOGRAPH-7CF8 # + +2EAF ; 7CF9 ; MA #* ( ⺯ → 糹 ) CJK RADICAL SILK → CJK UNIFIED IDEOGRAPH-7CF9 # + +2F96B ; 25F86 ; MA # ( 𥾆 → 𥾆 ) CJK COMPATIBILITY IDEOGRAPH-2F96B → CJK UNIFIED IDEOGRAPH-25F86 # + +2F96A ; 7D00 ; MA # ( 紀 → 紀 ) CJK COMPATIBILITY IDEOGRAPH-2F96A → CJK UNIFIED IDEOGRAPH-7D00 # + +F9CF ; 7D10 ; MA # ( 紐 → 紐 ) CJK COMPATIBILITY IDEOGRAPH-F9CF → CJK UNIFIED IDEOGRAPH-7D10 # + +F96A ; 7D22 ; MA # ( 索 → 索 ) CJK COMPATIBILITY IDEOGRAPH-F96A → CJK UNIFIED IDEOGRAPH-7D22 # + +F94F ; 7D2F ; MA # ( 累 → 累 ) CJK COMPATIBILITY IDEOGRAPH-F94F → CJK UNIFIED IDEOGRAPH-7D2F # + +7D76 ; 7D55 ; MA # ( 絶 → 絕 ) CJK UNIFIED IDEOGRAPH-7D76 → CJK UNIFIED IDEOGRAPH-7D55 # + +2F96C ; 7D63 ; MA # ( 絣 → 絣 ) CJK COMPATIBILITY IDEOGRAPH-2F96C → CJK UNIFIED IDEOGRAPH-7D63 # + +FAAF ; 7D5B ; MA # ( 絛 → 絛 ) CJK COMPATIBILITY IDEOGRAPH-FAAF → CJK UNIFIED IDEOGRAPH-7D5B # + +F93D ; 7DA0 ; MA # ( 綠 → 綠 ) CJK COMPATIBILITY IDEOGRAPH-F93D → CJK UNIFIED IDEOGRAPH-7DA0 # + +F957 ; 7DBE ; MA # ( 綾 → 綾 ) CJK COMPATIBILITY IDEOGRAPH-F957 → CJK UNIFIED IDEOGRAPH-7DBE # + +2F96E ; 7DC7 ; MA # ( 緇 → 緇 ) CJK COMPATIBILITY IDEOGRAPH-2F96E → CJK UNIFIED IDEOGRAPH-7DC7 # +31E7C ; 7DC7 ; MA # ( 𱹼 → 緇 ) CJK UNIFIED IDEOGRAPH-31E7C → CJK UNIFIED IDEOGRAPH-7DC7 # →緇→ + +F996 ; 7DF4 ; MA # ( 練 → 練 ) CJK COMPATIBILITY IDEOGRAPH-F996 → CJK UNIFIED IDEOGRAPH-7DF4 # +FA57 ; 7DF4 ; MA # ( 練 → 練 ) CJK COMPATIBILITY IDEOGRAPH-FA57 → CJK UNIFIED IDEOGRAPH-7DF4 # +FAB0 ; 7DF4 ; MA # ( 練 → 練 ) CJK COMPATIBILITY IDEOGRAPH-FAB0 → CJK UNIFIED IDEOGRAPH-7DF4 # + +2F96F 
; 7E02 ; MA # ( 縂 → 縂 ) CJK COMPATIBILITY IDEOGRAPH-2F96F → CJK UNIFIED IDEOGRAPH-7E02 # + +2F96D ; 4301 ; MA # ( 䌁 → 䌁 ) CJK COMPATIBILITY IDEOGRAPH-2F96D → CJK UNIFIED IDEOGRAPH-4301 # + +FA58 ; 7E09 ; MA # ( 縉 → 縉 ) CJK COMPATIBILITY IDEOGRAPH-FA58 → CJK UNIFIED IDEOGRAPH-7E09 # + +F950 ; 7E37 ; MA # ( 縷 → 縷 ) CJK COMPATIBILITY IDEOGRAPH-F950 → CJK UNIFIED IDEOGRAPH-7E37 # + +FA59 ; 7E41 ; MA # ( 繁 → 繁 ) CJK COMPATIBILITY IDEOGRAPH-FA59 → CJK UNIFIED IDEOGRAPH-7E41 # + +2F970 ; 7E45 ; MA # ( 繅 → 繅 ) CJK COMPATIBILITY IDEOGRAPH-2F970 → CJK UNIFIED IDEOGRAPH-7E45 # + +2F898 ; 261DA ; MA # ( 𦇚 → 𦇚 ) CJK COMPATIBILITY IDEOGRAPH-2F898 → CJK UNIFIED IDEOGRAPH-261DA # + +2F971 ; 4334 ; MA # ( 䌴 → 䌴 ) CJK COMPATIBILITY IDEOGRAPH-2F971 → CJK UNIFIED IDEOGRAPH-4334 # + +2F78 ; 7F36 ; MA #* ( ⽸ → 缶 ) KANGXI RADICAL JAR → CJK UNIFIED IDEOGRAPH-7F36 # + +2F972 ; 26228 ; MA # ( 𦈨 → 𦈨 ) CJK COMPATIBILITY IDEOGRAPH-2F972 → CJK UNIFIED IDEOGRAPH-26228 # + +FAB1 ; 7F3E ; MA # ( 缾 → 缾 ) CJK COMPATIBILITY IDEOGRAPH-FAB1 → CJK UNIFIED IDEOGRAPH-7F3E # + +2F973 ; 26247 ; MA # ( 𦉇 → 𦉇 ) CJK COMPATIBILITY IDEOGRAPH-2F973 → CJK UNIFIED IDEOGRAPH-26247 # + +2F79 ; 7F51 ; MA #* ( ⽹ → 网 ) KANGXI RADICAL NET → CJK UNIFIED IDEOGRAPH-7F51 # + +2EAB ; 7F52 ; MA #* ( ⺫ → 罒 ) CJK RADICAL EYE → CJK UNIFIED IDEOGRAPH-7F52 # +2EB2 ; 7F52 ; MA #* ( ⺲ → 罒 ) CJK RADICAL NET TWO → CJK UNIFIED IDEOGRAPH-7F52 # + +2EB1 ; 7F53 ; MA #* ( ⺱ → 罓 ) CJK RADICAL NET ONE → CJK UNIFIED IDEOGRAPH-7F53 # + +2F974 ; 4359 ; MA # ( 䍙 → 䍙 ) CJK COMPATIBILITY IDEOGRAPH-2F974 → CJK UNIFIED IDEOGRAPH-4359 # + +FA5A ; 7F72 ; MA # ( 署 → 署 ) CJK COMPATIBILITY IDEOGRAPH-FA5A → CJK UNIFIED IDEOGRAPH-7F72 # + +2F975 ; 262D9 ; MA # ( 𦋙 → 𦋙 ) CJK COMPATIBILITY IDEOGRAPH-2F975 → CJK UNIFIED IDEOGRAPH-262D9 # + +F9E6 ; 7F79 ; MA # ( 罹 → 罹 ) CJK COMPATIBILITY IDEOGRAPH-F9E6 → CJK UNIFIED IDEOGRAPH-7F79 # + +2F976 ; 7F7A ; MA # ( 罺 → 罺 ) CJK COMPATIBILITY IDEOGRAPH-2F976 → CJK UNIFIED IDEOGRAPH-7F7A # + +F90F ; 7F85 ; MA # ( 羅 → 羅 ) 
CJK COMPATIBILITY IDEOGRAPH-F90F → CJK UNIFIED IDEOGRAPH-7F85 # + +2F977 ; 2633E ; MA # ( 𦌾 → 𦌾 ) CJK COMPATIBILITY IDEOGRAPH-2F977 → CJK UNIFIED IDEOGRAPH-2633E # + +2F7A ; 7F8A ; MA #* ( ⽺ → 羊 ) KANGXI RADICAL SHEEP → CJK UNIFIED IDEOGRAPH-7F8A # + +2F978 ; 7F95 ; MA # ( 羕 → 羕 ) CJK COMPATIBILITY IDEOGRAPH-2F978 → CJK UNIFIED IDEOGRAPH-7F95 # + +F9AF ; 7F9A ; MA # ( 羚 → 羚 ) CJK COMPATIBILITY IDEOGRAPH-F9AF → CJK UNIFIED IDEOGRAPH-7F9A # + +FA1E ; 7FBD ; MA # ( 羽 → 羽 ) CJK COMPATIBILITY IDEOGRAPH-FA1E → CJK UNIFIED IDEOGRAPH-7FBD # +2F7B ; 7FBD ; MA #* ( ⽻ → 羽 ) KANGXI RADICAL FEATHER → CJK UNIFIED IDEOGRAPH-7FBD # + +2F979 ; 7FFA ; MA # ( 翺 → 翺 ) CJK COMPATIBILITY IDEOGRAPH-2F979 → CJK UNIFIED IDEOGRAPH-7FFA # + +F934 ; 8001 ; MA # ( 老 → 老 ) CJK COMPATIBILITY IDEOGRAPH-F934 → CJK UNIFIED IDEOGRAPH-8001 # +2F7C ; 8001 ; MA #* ( ⽼ → 老 ) KANGXI RADICAL OLD → CJK UNIFIED IDEOGRAPH-8001 # + +2EB9 ; 8002 ; MA #* ( ⺹ → 耂 ) CJK RADICAL OLD → CJK UNIFIED IDEOGRAPH-8002 # + +FA5B ; 8005 ; MA # ( 者 → 者 ) CJK COMPATIBILITY IDEOGRAPH-FA5B → CJK UNIFIED IDEOGRAPH-8005 # +FAB2 ; 8005 ; MA # ( 者 → 者 ) CJK COMPATIBILITY IDEOGRAPH-FAB2 → CJK UNIFIED IDEOGRAPH-8005 # +2F97A ; 8005 ; MA # ( 者 → 者 ) CJK COMPATIBILITY IDEOGRAPH-2F97A → CJK UNIFIED IDEOGRAPH-8005 # + +2F7D ; 800C ; MA #* ( ⽽ → 而 ) KANGXI RADICAL AND → CJK UNIFIED IDEOGRAPH-800C # + +2F97B ; 264DA ; MA # ( 𦓚 → 𦓚 ) CJK COMPATIBILITY IDEOGRAPH-2F97B → CJK UNIFIED IDEOGRAPH-264DA # + +2F7E ; 8012 ; MA #* ( ⽾ → 耒 ) KANGXI RADICAL PLOW → CJK UNIFIED IDEOGRAPH-8012 # + +2F97C ; 26523 ; MA # ( 𦔣 → 𦔣 ) CJK COMPATIBILITY IDEOGRAPH-2F97C → CJK UNIFIED IDEOGRAPH-26523 # + +2F7F ; 8033 ; MA #* ( ⽿ → 耳 ) KANGXI RADICAL EAR → CJK UNIFIED IDEOGRAPH-8033 # + +F9B0 ; 8046 ; MA # ( 聆 → 聆 ) CJK COMPATIBILITY IDEOGRAPH-F9B0 → CJK UNIFIED IDEOGRAPH-8046 # + +2F97D ; 8060 ; MA # ( 聠 → 聠 ) CJK COMPATIBILITY IDEOGRAPH-2F97D → CJK UNIFIED IDEOGRAPH-8060 # + +2659D ; 265A8 ; MA # ( 𦖝 → 𦖨 ) CJK UNIFIED IDEOGRAPH-2659D → CJK UNIFIED 
IDEOGRAPH-265A8 # →𦖨→ +2F97E ; 265A8 ; MA # ( 𦖨 → 𦖨 ) CJK COMPATIBILITY IDEOGRAPH-2F97E → CJK UNIFIED IDEOGRAPH-265A8 # + +F997 ; 806F ; MA # ( 聯 → 聯 ) CJK COMPATIBILITY IDEOGRAPH-F997 → CJK UNIFIED IDEOGRAPH-806F # + +2F97F ; 8070 ; MA # ( 聰 → 聰 ) CJK COMPATIBILITY IDEOGRAPH-2F97F → CJK UNIFIED IDEOGRAPH-8070 # + +F945 ; 807E ; MA # ( 聾 → 聾 ) CJK COMPATIBILITY IDEOGRAPH-F945 → CJK UNIFIED IDEOGRAPH-807E # + +2F80 ; 807F ; MA #* ( ⾀ → 聿 ) KANGXI RADICAL BRUSH → CJK UNIFIED IDEOGRAPH-807F # + +2EBA ; 8080 ; MA #* ( ⺺ → 肀 ) CJK RADICAL BRUSH ONE → CJK UNIFIED IDEOGRAPH-8080 # + +2F81 ; 8089 ; MA #* ( ⾁ → 肉 ) KANGXI RADICAL MEAT → CJK UNIFIED IDEOGRAPH-8089 # + +F953 ; 808B ; MA # ( 肋 → 肋 ) CJK COMPATIBILITY IDEOGRAPH-F953 → CJK UNIFIED IDEOGRAPH-808B # + +2F8D6 ; 80AD ; MA # ( 肭 → 肭 ) CJK COMPATIBILITY IDEOGRAPH-2F8D6 → CJK UNIFIED IDEOGRAPH-80AD # + +2F982 ; 80B2 ; MA # ( 育 → 育 ) CJK COMPATIBILITY IDEOGRAPH-2F982 → CJK UNIFIED IDEOGRAPH-80B2 # + +2F981 ; 43D5 ; MA # ( 䏕 → 䏕 ) CJK COMPATIBILITY IDEOGRAPH-2F981 → CJK UNIFIED IDEOGRAPH-43D5 # + +2F8D7 ; 43D9 ; MA # ( 䏙 → 䏙 ) CJK COMPATIBILITY IDEOGRAPH-2F8D7 → CJK UNIFIED IDEOGRAPH-43D9 # + +8141 ; 80FC ; MA # ( 腁 → 胼 ) CJK UNIFIED IDEOGRAPH-8141 → CJK UNIFIED IDEOGRAPH-80FC # + +2F983 ; 8103 ; MA # ( 脃 → 脃 ) CJK COMPATIBILITY IDEOGRAPH-2F983 → CJK UNIFIED IDEOGRAPH-8103 # + +2F985 ; 813E ; MA # ( 脾 → 脾 ) CJK COMPATIBILITY IDEOGRAPH-2F985 → CJK UNIFIED IDEOGRAPH-813E # + +2F984 ; 440B ; MA # ( 䐋 → 䐋 ) CJK COMPATIBILITY IDEOGRAPH-2F984 → CJK UNIFIED IDEOGRAPH-440B # + +2F8DA ; 6721 ; MA # ( 朡 → 朡 ) CJK COMPATIBILITY IDEOGRAPH-2F8DA → CJK UNIFIED IDEOGRAPH-6721 # + +2F987 ; 267A7 ; MA # ( 𦞧 → 𦞧 ) CJK COMPATIBILITY IDEOGRAPH-2F987 → CJK UNIFIED IDEOGRAPH-267A7 # + +2F988 ; 267B5 ; MA # ( 𦞵 → 𦞵 ) CJK COMPATIBILITY IDEOGRAPH-2F988 → CJK UNIFIED IDEOGRAPH-267B5 # + +6726 ; 4443 ; MA # ( 朦 → 䑃 ) CJK UNIFIED IDEOGRAPH-6726 → CJK UNIFIED IDEOGRAPH-4443 # + +F926 ; 81D8 ; MA # ( 臘 → 臘 ) CJK COMPATIBILITY IDEOGRAPH-F926 → CJK 
UNIFIED IDEOGRAPH-81D8 # + +2F82 ; 81E3 ; MA #* ( ⾂ → 臣 ) KANGXI RADICAL MINISTER → CJK UNIFIED IDEOGRAPH-81E3 # + +F9F6 ; 81E8 ; MA # ( 臨 → 臨 ) CJK COMPATIBILITY IDEOGRAPH-F9F6 → CJK UNIFIED IDEOGRAPH-81E8 # + +2F83 ; 81EA ; MA #* ( ⾃ → 自 ) KANGXI RADICAL SELF → CJK UNIFIED IDEOGRAPH-81EA # + +FA5C ; 81ED ; MA # ( 臭 → 臭 ) CJK COMPATIBILITY IDEOGRAPH-FA5C → CJK UNIFIED IDEOGRAPH-81ED # + +2F84 ; 81F3 ; MA #* ( ⾄ → 至 ) KANGXI RADICAL ARRIVE → CJK UNIFIED IDEOGRAPH-81F3 # + +2F85 ; 81FC ; MA #* ( ⾅ → 臼 ) KANGXI RADICAL MORTAR → CJK UNIFIED IDEOGRAPH-81FC # + +2F893 ; 8201 ; MA # ( 舁 → 舁 ) CJK COMPATIBILITY IDEOGRAPH-2F893 → CJK UNIFIED IDEOGRAPH-8201 # +2F98B ; 8201 ; MA # ( 舁 → 舁 ) CJK COMPATIBILITY IDEOGRAPH-2F98B → CJK UNIFIED IDEOGRAPH-8201 # + +2F98C ; 8204 ; MA # ( 舄 → 舄 ) CJK COMPATIBILITY IDEOGRAPH-2F98C → CJK UNIFIED IDEOGRAPH-8204 # + +2F86 ; 820C ; MA #* ( ⾆ → 舌 ) KANGXI RADICAL TONGUE → CJK UNIFIED IDEOGRAPH-820C # + +FA6D ; 8218 ; MA # ( 舘 → 舘 ) CJK COMPATIBILITY IDEOGRAPH-FA6D → CJK UNIFIED IDEOGRAPH-8218 # + +2F87 ; 821B ; MA #* ( ⾇ → 舛 ) KANGXI RADICAL OPPOSE → CJK UNIFIED IDEOGRAPH-821B # + +2F88 ; 821F ; MA #* ( ⾈ → 舟 ) KANGXI RADICAL BOAT → CJK UNIFIED IDEOGRAPH-821F # + +2F98E ; 446B ; MA # ( 䑫 → 䑫 ) CJK COMPATIBILITY IDEOGRAPH-2F98E → CJK UNIFIED IDEOGRAPH-446B # + +2F89 ; 826E ; MA #* ( ⾉ → 艮 ) KANGXI RADICAL STOPPING → CJK UNIFIED IDEOGRAPH-826E # + +F97C ; 826F ; MA # ( 良 → 良 ) CJK COMPATIBILITY IDEOGRAPH-F97C → CJK UNIFIED IDEOGRAPH-826F # + +2F8A ; 8272 ; MA #* ( ⾊ → 色 ) KANGXI RADICAL COLOR → CJK UNIFIED IDEOGRAPH-8272 # + +2F8B ; 8278 ; MA #* ( ⾋ → 艸 ) KANGXI RADICAL GRASS → CJK UNIFIED IDEOGRAPH-8278 # + +FA5D ; 8279 ; MA # ( 艹 → 艹 ) CJK COMPATIBILITY IDEOGRAPH-FA5D → CJK UNIFIED IDEOGRAPH-8279 # +FA5E ; 8279 ; MA # ( 艹 → 艹 ) CJK COMPATIBILITY IDEOGRAPH-FA5E → CJK UNIFIED IDEOGRAPH-8279 # +2EBE ; 8279 ; MA #* ( ⺾ → 艹 ) CJK RADICAL GRASS ONE → CJK UNIFIED IDEOGRAPH-8279 # +2EBF ; 8279 ; MA #* ( ⺿ → 艹 ) CJK RADICAL GRASS TWO → CJK UNIFIED 
IDEOGRAPH-8279 # →艹→ +2EC0 ; 8279 ; MA #* ( ⻀ → 艹 ) CJK RADICAL GRASS THREE → CJK UNIFIED IDEOGRAPH-8279 # →艹→ + +2F990 ; 828B ; MA # ( 芋 → 芋 ) CJK COMPATIBILITY IDEOGRAPH-2F990 → CJK UNIFIED IDEOGRAPH-828B # + +2F98F ; 8291 ; MA # ( 芑 → 芑 ) CJK COMPATIBILITY IDEOGRAPH-2F98F → CJK UNIFIED IDEOGRAPH-8291 # + +2F991 ; 829D ; MA # ( 芝 → 芝 ) CJK COMPATIBILITY IDEOGRAPH-2F991 → CJK UNIFIED IDEOGRAPH-829D # + +2F993 ; 82B1 ; MA # ( 花 → 花 ) CJK COMPATIBILITY IDEOGRAPH-2F993 → CJK UNIFIED IDEOGRAPH-82B1 # + +2F994 ; 82B3 ; MA # ( 芳 → 芳 ) CJK COMPATIBILITY IDEOGRAPH-2F994 → CJK UNIFIED IDEOGRAPH-82B3 # + +2F995 ; 82BD ; MA # ( 芽 → 芽 ) CJK COMPATIBILITY IDEOGRAPH-2F995 → CJK UNIFIED IDEOGRAPH-82BD # + +F974 ; 82E5 ; MA # ( 若 → 若 ) CJK COMPATIBILITY IDEOGRAPH-F974 → CJK UNIFIED IDEOGRAPH-82E5 # +2F998 ; 82E5 ; MA # ( 若 → 若 ) CJK COMPATIBILITY IDEOGRAPH-2F998 → CJK UNIFIED IDEOGRAPH-82E5 # + +2F996 ; 82E6 ; MA # ( 苦 → 苦 ) CJK COMPATIBILITY IDEOGRAPH-2F996 → CJK UNIFIED IDEOGRAPH-82E6 # + +2F997 ; 26B3C ; MA # ( 𦬼 → 𦬼 ) CJK COMPATIBILITY IDEOGRAPH-2F997 → CJK UNIFIED IDEOGRAPH-26B3C # + +F9FE ; 8336 ; MA # ( 茶 → 茶 ) CJK COMPATIBILITY IDEOGRAPH-F9FE → CJK UNIFIED IDEOGRAPH-8336 # + +FAB3 ; 8352 ; MA # ( 荒 → 荒 ) CJK COMPATIBILITY IDEOGRAPH-FAB3 → CJK UNIFIED IDEOGRAPH-8352 # + +2F99A ; 8363 ; MA # ( 荣 → 荣 ) CJK COMPATIBILITY IDEOGRAPH-2F99A → CJK UNIFIED IDEOGRAPH-8363 # + +2F999 ; 831D ; MA # ( 茝 → 茝 ) CJK COMPATIBILITY IDEOGRAPH-2F999 → CJK UNIFIED IDEOGRAPH-831D # + +2F99C ; 8323 ; MA # ( 茣 → 茣 ) CJK COMPATIBILITY IDEOGRAPH-2F99C → CJK UNIFIED IDEOGRAPH-8323 # + +2F99D ; 83BD ; MA # ( 莽 → 莽 ) CJK COMPATIBILITY IDEOGRAPH-2F99D → CJK UNIFIED IDEOGRAPH-83BD # + +2F9A0 ; 8353 ; MA # ( 荓 → 荓 ) CJK COMPATIBILITY IDEOGRAPH-2F9A0 → CJK UNIFIED IDEOGRAPH-8353 # + +F93E ; 83C9 ; MA # ( 菉 → 菉 ) CJK COMPATIBILITY IDEOGRAPH-F93E → CJK UNIFIED IDEOGRAPH-83C9 # + +2F9A1 ; 83CA ; MA # ( 菊 → 菊 ) CJK COMPATIBILITY IDEOGRAPH-2F9A1 → CJK UNIFIED IDEOGRAPH-83CA # + +2F9A2 ; 83CC ; MA # ( 菌 → 菌 ) 
CJK COMPATIBILITY IDEOGRAPH-2F9A2 → CJK UNIFIED IDEOGRAPH-83CC # + +2F9A3 ; 83DC ; MA # ( 菜 → 菜 ) CJK COMPATIBILITY IDEOGRAPH-2F9A3 → CJK UNIFIED IDEOGRAPH-83DC # + +2F99E ; 83E7 ; MA # ( 菧 → 菧 ) CJK COMPATIBILITY IDEOGRAPH-2F99E → CJK UNIFIED IDEOGRAPH-83E7 # + +FAB4 ; 83EF ; MA # ( 華 → 華 ) CJK COMPATIBILITY IDEOGRAPH-FAB4 → CJK UNIFIED IDEOGRAPH-83EF # + +F958 ; 83F1 ; MA # ( 菱 → 菱 ) CJK COMPATIBILITY IDEOGRAPH-F958 → CJK UNIFIED IDEOGRAPH-83F1 # + +FA5F ; 8457 ; MA # ( 著 → 著 ) CJK COMPATIBILITY IDEOGRAPH-FA5F → CJK UNIFIED IDEOGRAPH-8457 # +2F99F ; 8457 ; MA # ( 著 → 著 ) CJK COMPATIBILITY IDEOGRAPH-2F99F → CJK UNIFIED IDEOGRAPH-8457 # + +2F9A4 ; 26C36 ; MA # ( 𦰶 → 𦰶 ) CJK COMPATIBILITY IDEOGRAPH-2F9A4 → CJK UNIFIED IDEOGRAPH-26C36 # +26D06 ; 26C36 ; MA # ( 𦴆 → 𦰶 ) CJK UNIFIED IDEOGRAPH-26D06 → CJK UNIFIED IDEOGRAPH-26C36 # →𦰶→ + +2F99B ; 83AD ; MA # ( 莭 → 莭 ) CJK COMPATIBILITY IDEOGRAPH-2F99B → CJK UNIFIED IDEOGRAPH-83AD # + +F918 ; 843D ; MA # ( 落 → 落 ) CJK COMPATIBILITY IDEOGRAPH-F918 → CJK UNIFIED IDEOGRAPH-843D # + +F96E ; 8449 ; MA # ( 葉 → 葉 ) CJK COMPATIBILITY IDEOGRAPH-F96E → CJK UNIFIED IDEOGRAPH-8449 # + +853F ; 848D ; MA # ( 蔿 → 蒍 ) CJK UNIFIED IDEOGRAPH-853F → CJK UNIFIED IDEOGRAPH-848D # + +2F9A6 ; 26CD5 ; MA # ( 𦳕 → 𦳕 ) CJK COMPATIBILITY IDEOGRAPH-2F9A6 → CJK UNIFIED IDEOGRAPH-26CD5 # + +2F9A5 ; 26D6B ; MA # ( 𦵫 → 𦵫 ) CJK COMPATIBILITY IDEOGRAPH-2F9A5 → CJK UNIFIED IDEOGRAPH-26D6B # + +F999 ; 84EE ; MA # ( 蓮 → 蓮 ) CJK COMPATIBILITY IDEOGRAPH-F999 → CJK UNIFIED IDEOGRAPH-84EE # + +2F9A8 ; 84F1 ; MA # ( 蓱 → 蓱 ) CJK COMPATIBILITY IDEOGRAPH-2F9A8 → CJK UNIFIED IDEOGRAPH-84F1 # + +2F9A9 ; 84F3 ; MA # ( 蓳 → 蓳 ) CJK COMPATIBILITY IDEOGRAPH-2F9A9 → CJK UNIFIED IDEOGRAPH-84F3 # + +F9C2 ; 84FC ; MA # ( 蓼 → 蓼 ) CJK COMPATIBILITY IDEOGRAPH-F9C2 → CJK UNIFIED IDEOGRAPH-84FC # + +2F9AA ; 8516 ; MA # ( 蔖 → 蔖 ) CJK COMPATIBILITY IDEOGRAPH-2F9AA → CJK UNIFIED IDEOGRAPH-8516 # + +2F9A7 ; 452B ; MA # ( 䔫 → 䔫 ) CJK COMPATIBILITY IDEOGRAPH-2F9A7 → CJK UNIFIED 
IDEOGRAPH-452B # + +2F9AC ; 8564 ; MA # ( 蕤 → 蕤 ) CJK COMPATIBILITY IDEOGRAPH-2F9AC → CJK UNIFIED IDEOGRAPH-8564 # + +2F9AD ; 26F2C ; MA # ( 𦼬 → 𦼬 ) CJK COMPATIBILITY IDEOGRAPH-2F9AD → CJK UNIFIED IDEOGRAPH-26F2C # + +F923 ; 85CD ; MA # ( 藍 → 藍 ) CJK COMPATIBILITY IDEOGRAPH-F923 → CJK UNIFIED IDEOGRAPH-85CD # + +2F9AE ; 455D ; MA # ( 䕝 → 䕝 ) CJK COMPATIBILITY IDEOGRAPH-2F9AE → CJK UNIFIED IDEOGRAPH-455D # + +2F9B0 ; 26FB1 ; MA # ( 𦾱 → 𦾱 ) CJK COMPATIBILITY IDEOGRAPH-2F9B0 → CJK UNIFIED IDEOGRAPH-26FB1 # + +2F9AF ; 4561 ; MA # ( 䕡 → 䕡 ) CJK COMPATIBILITY IDEOGRAPH-2F9AF → CJK UNIFIED IDEOGRAPH-4561 # + +F9F0 ; 85FA ; MA # ( 藺 → 藺 ) CJK COMPATIBILITY IDEOGRAPH-F9F0 → CJK UNIFIED IDEOGRAPH-85FA # + +F935 ; 8606 ; MA # ( 蘆 → 蘆 ) CJK COMPATIBILITY IDEOGRAPH-F935 → CJK UNIFIED IDEOGRAPH-8606 # + +2F9B2 ; 456B ; MA # ( 䕫 → 䕫 ) CJK COMPATIBILITY IDEOGRAPH-2F9B2 → CJK UNIFIED IDEOGRAPH-456B # + +FA20 ; 8612 ; MA # ( 蘒 → 蘒 ) CJK COMPATIBILITY IDEOGRAPH-FA20 → CJK UNIFIED IDEOGRAPH-8612 # + +F91F ; 862D ; MA # ( 蘭 → 蘭 ) CJK COMPATIBILITY IDEOGRAPH-F91F → CJK UNIFIED IDEOGRAPH-862D # + +2F9B1 ; 270D2 ; MA # ( 𧃒 → 𧃒 ) CJK COMPATIBILITY IDEOGRAPH-2F9B1 → CJK UNIFIED IDEOGRAPH-270D2 # + +8641 ; 8637 ; MA # ( 虁 → 蘷 ) CJK UNIFIED IDEOGRAPH-8641 → CJK UNIFIED IDEOGRAPH-8637 # + +F910 ; 863F ; MA # ( 蘿 → 蘿 ) CJK COMPATIBILITY IDEOGRAPH-F910 → CJK UNIFIED IDEOGRAPH-863F # + +2F8C ; 864D ; MA #* ( ⾌ → 虍 ) KANGXI RADICAL TIGER → CJK UNIFIED IDEOGRAPH-864D # + +2EC1 ; 864E ; MA #* ( ⻁ → 虎 ) CJK RADICAL TIGER → CJK UNIFIED IDEOGRAPH-864E # + +2F9B3 ; 8650 ; MA # ( 虐 → 虐 ) CJK COMPATIBILITY IDEOGRAPH-2F9B3 → CJK UNIFIED IDEOGRAPH-8650 # + +F936 ; 865C ; MA # ( 虜 → 虜 ) CJK COMPATIBILITY IDEOGRAPH-F936 → CJK UNIFIED IDEOGRAPH-865C # +2F9B4 ; 865C ; MA # ( 虜 → 虜 ) CJK COMPATIBILITY IDEOGRAPH-2F9B4 → CJK UNIFIED IDEOGRAPH-865C # + +2F9B5 ; 8667 ; MA # ( 虧 → 虧 ) CJK COMPATIBILITY IDEOGRAPH-2F9B5 → CJK UNIFIED IDEOGRAPH-8667 # + +2F9B6 ; 8669 ; MA # ( 虩 → 虩 ) CJK COMPATIBILITY IDEOGRAPH-2F9B6 → 
CJK UNIFIED IDEOGRAPH-8669 # + +2F8D ; 866B ; MA #* ( ⾍ → 虫 ) KANGXI RADICAL INSECT → CJK UNIFIED IDEOGRAPH-866B # + +2F9B7 ; 86A9 ; MA # ( 蚩 → 蚩 ) CJK COMPATIBILITY IDEOGRAPH-2F9B7 → CJK UNIFIED IDEOGRAPH-86A9 # + +2F9B8 ; 8688 ; MA # ( 蚈 → 蚈 ) CJK COMPATIBILITY IDEOGRAPH-2F9B8 → CJK UNIFIED IDEOGRAPH-8688 # + +2F9BA ; 86E2 ; MA # ( 蛢 → 蛢 ) CJK COMPATIBILITY IDEOGRAPH-2F9BA → CJK UNIFIED IDEOGRAPH-86E2 # + +2F9B9 ; 870E ; MA # ( 蜎 → 蜎 ) CJK COMPATIBILITY IDEOGRAPH-2F9B9 → CJK UNIFIED IDEOGRAPH-870E # + +2F9BC ; 8728 ; MA # ( 蜨 → 蜨 ) CJK COMPATIBILITY IDEOGRAPH-2F9BC → CJK UNIFIED IDEOGRAPH-8728 # + +2F9BD ; 876B ; MA # ( 蝫 → 蝫 ) CJK COMPATIBILITY IDEOGRAPH-2F9BD → CJK UNIFIED IDEOGRAPH-876B # + +2F9C0 ; 87E1 ; MA # ( 蟡 → 蟡 ) CJK COMPATIBILITY IDEOGRAPH-2F9C0 → CJK UNIFIED IDEOGRAPH-87E1 # + +FAB5 ; 8779 ; MA # ( 蝹 → 蝹 ) CJK COMPATIBILITY IDEOGRAPH-FAB5 → CJK UNIFIED IDEOGRAPH-8779 # +2F9BB ; 8779 ; MA # ( 蝹 → 蝹 ) CJK COMPATIBILITY IDEOGRAPH-2F9BB → CJK UNIFIED IDEOGRAPH-8779 # + +2F9BE ; 8786 ; MA # ( 螆 → 螆 ) CJK COMPATIBILITY IDEOGRAPH-2F9BE → CJK UNIFIED IDEOGRAPH-8786 # + +2F9BF ; 45D7 ; MA # ( 䗗 → 䗗 ) CJK COMPATIBILITY IDEOGRAPH-2F9BF → CJK UNIFIED IDEOGRAPH-45D7 # + +2F9AB ; 273CA ; MA # ( 𧏊 → 𧏊 ) CJK COMPATIBILITY IDEOGRAPH-2F9AB → CJK UNIFIED IDEOGRAPH-273CA # + +F911 ; 87BA ; MA # ( 螺 → 螺 ) CJK COMPATIBILITY IDEOGRAPH-F911 → CJK UNIFIED IDEOGRAPH-87BA # + +2F9C1 ; 8801 ; MA # ( 蠁 → 蠁 ) CJK COMPATIBILITY IDEOGRAPH-2F9C1 → CJK UNIFIED IDEOGRAPH-8801 # + +2F9C2 ; 45F9 ; MA # ( 䗹 → 䗹 ) CJK COMPATIBILITY IDEOGRAPH-2F9C2 → CJK UNIFIED IDEOGRAPH-45F9 # + +F927 ; 881F ; MA # ( 蠟 → 蠟 ) CJK COMPATIBILITY IDEOGRAPH-F927 → CJK UNIFIED IDEOGRAPH-881F # + +2F8E ; 8840 ; MA #* ( ⾎ → 血 ) KANGXI RADICAL BLOOD → CJK UNIFIED IDEOGRAPH-8840 # + +FA08 ; 884C ; MA # ( 行 → 行 ) CJK COMPATIBILITY IDEOGRAPH-FA08 → CJK UNIFIED IDEOGRAPH-884C # +2F8F ; 884C ; MA #* ( ⾏ → 行 ) KANGXI RADICAL WALK ENCLOSURE → CJK UNIFIED IDEOGRAPH-884C # + +2F9C3 ; 8860 ; MA # ( 衠 → 衠 ) CJK 
COMPATIBILITY IDEOGRAPH-2F9C3 → CJK UNIFIED IDEOGRAPH-8860 # + +2F9C4 ; 8863 ; MA # ( 衣 → 衣 ) CJK COMPATIBILITY IDEOGRAPH-2F9C4 → CJK UNIFIED IDEOGRAPH-8863 # +2F90 ; 8863 ; MA #* ( ⾐ → 衣 ) KANGXI RADICAL CLOTHES → CJK UNIFIED IDEOGRAPH-8863 # + +2EC2 ; 8864 ; MA #* ( ⻂ → 衤 ) CJK RADICAL CLOTHES → CJK UNIFIED IDEOGRAPH-8864 # + +F9A0 ; 88C2 ; MA # ( 裂 → 裂 ) CJK COMPATIBILITY IDEOGRAPH-F9A0 → CJK UNIFIED IDEOGRAPH-88C2 # + +2F9C5 ; 27667 ; MA # ( 𧙧 → 𧙧 ) CJK COMPATIBILITY IDEOGRAPH-2F9C5 → CJK UNIFIED IDEOGRAPH-27667 # + +F9E7 ; 88CF ; MA # ( 裏 → 裏 ) CJK COMPATIBILITY IDEOGRAPH-F9E7 → CJK UNIFIED IDEOGRAPH-88CF # + +2F9C6 ; 88D7 ; MA # ( 裗 → 裗 ) CJK COMPATIBILITY IDEOGRAPH-2F9C6 → CJK UNIFIED IDEOGRAPH-88D7 # + +2F9C7 ; 88DE ; MA # ( 裞 → 裞 ) CJK COMPATIBILITY IDEOGRAPH-2F9C7 → CJK UNIFIED IDEOGRAPH-88DE # + +F9E8 ; 88E1 ; MA # ( 裡 → 裡 ) CJK COMPATIBILITY IDEOGRAPH-F9E8 → CJK UNIFIED IDEOGRAPH-88E1 # + +F912 ; 88F8 ; MA # ( 裸 → 裸 ) CJK COMPATIBILITY IDEOGRAPH-F912 → CJK UNIFIED IDEOGRAPH-88F8 # + +2F9C9 ; 88FA ; MA # ( 裺 → 裺 ) CJK COMPATIBILITY IDEOGRAPH-2F9C9 → CJK UNIFIED IDEOGRAPH-88FA # + +2F9C8 ; 4635 ; MA # ( 䘵 → 䘵 ) CJK COMPATIBILITY IDEOGRAPH-2F9C8 → CJK UNIFIED IDEOGRAPH-4635 # + +FA60 ; 8910 ; MA # ( 褐 → 褐 ) CJK COMPATIBILITY IDEOGRAPH-FA60 → CJK UNIFIED IDEOGRAPH-8910 # + +FAB6 ; 8941 ; MA # ( 襁 → 襁 ) CJK COMPATIBILITY IDEOGRAPH-FAB6 → CJK UNIFIED IDEOGRAPH-8941 # + +F924 ; 8964 ; MA # ( 襤 → 襤 ) CJK COMPATIBILITY IDEOGRAPH-F924 → CJK UNIFIED IDEOGRAPH-8964 # + +2F91 ; 897E ; MA #* ( ⾑ → 襾 ) KANGXI RADICAL WEST → CJK UNIFIED IDEOGRAPH-897E # + +2EC4 ; 897F ; MA #* ( ⻄ → 西 ) CJK RADICAL WEST TWO → CJK UNIFIED IDEOGRAPH-897F # + +2EC3 ; 8980 ; MA #* ( ⻃ → 覀 ) CJK RADICAL WEST ONE → CJK UNIFIED IDEOGRAPH-8980 # + +FAB7 ; 8986 ; MA # ( 覆 → 覆 ) CJK COMPATIBILITY IDEOGRAPH-FAB7 → CJK UNIFIED IDEOGRAPH-8986 # + +FA0A ; 898B ; MA # ( 見 → 見 ) CJK COMPATIBILITY IDEOGRAPH-FA0A → CJK UNIFIED IDEOGRAPH-898B # +2F92 ; 898B ; MA #* ( ⾒ → 見 ) KANGXI RADICAL SEE → CJK 
UNIFIED IDEOGRAPH-898B # + +2EC5 ; 89C1 ; MA #* ( ⻅ → 见 ) CJK RADICAL C-SIMPLIFIED SEE → CJK UNIFIED IDEOGRAPH-89C1 # + +4695 ; 278AE ; MA # ( 䚕 → 𧢮 ) CJK UNIFIED IDEOGRAPH-4695 → CJK UNIFIED IDEOGRAPH-278AE # →𧢮→ +2F9CB ; 278AE ; MA # ( 𧢮 → 𧢮 ) CJK COMPATIBILITY IDEOGRAPH-2F9CB → CJK UNIFIED IDEOGRAPH-278AE # + +2F93 ; 89D2 ; MA #* ( ⾓ → 角 ) KANGXI RADICAL HORN → CJK UNIFIED IDEOGRAPH-89D2 # + +2F94 ; 8A00 ; MA #* ( ⾔ → 言 ) KANGXI RADICAL SPEECH → CJK UNIFIED IDEOGRAPH-8A00 # + +2EC8 ; 8BA0 ; MA #* ( ⻈ → 讠 ) CJK RADICAL C-SIMPLIFIED SPEECH → CJK UNIFIED IDEOGRAPH-8BA0 # + +2F9CC ; 27966 ; MA # ( 𧥦 → 𧥦 ) CJK COMPATIBILITY IDEOGRAPH-2F9CC → CJK UNIFIED IDEOGRAPH-27966 # + +8A7D ; 8A2E ; MA # ( 詽 → 訮 ) CJK UNIFIED IDEOGRAPH-8A7D → CJK UNIFIED IDEOGRAPH-8A2E # + +8A1E ; 46B6 ; MA # ( 訞 → 䚶 ) CJK UNIFIED IDEOGRAPH-8A1E → CJK UNIFIED IDEOGRAPH-46B6 # + +2F9CD ; 46BE ; MA # ( 䚾 → 䚾 ) CJK COMPATIBILITY IDEOGRAPH-2F9CD → CJK UNIFIED IDEOGRAPH-46BE # + +2F9CE ; 46C7 ; MA # ( 䛇 → 䛇 ) CJK COMPATIBILITY IDEOGRAPH-2F9CE → CJK UNIFIED IDEOGRAPH-46C7 # + +2F9CF ; 8AA0 ; MA # ( 誠 → 誠 ) CJK COMPATIBILITY IDEOGRAPH-2F9CF → CJK UNIFIED IDEOGRAPH-8AA0 # + +F96F ; 8AAA ; MA # ( 說 → 說 ) CJK COMPATIBILITY IDEOGRAPH-F96F → CJK UNIFIED IDEOGRAPH-8AAA # +F9A1 ; 8AAA ; MA # ( 說 → 說 ) CJK COMPATIBILITY IDEOGRAPH-F9A1 → CJK UNIFIED IDEOGRAPH-8AAA # + +FAB9 ; 8ABF ; MA # ( 調 → 調 ) CJK COMPATIBILITY IDEOGRAPH-FAB9 → CJK UNIFIED IDEOGRAPH-8ABF # + +FABB ; 8ACB ; MA # ( 請 → 請 ) CJK COMPATIBILITY IDEOGRAPH-FABB → CJK UNIFIED IDEOGRAPH-8ACB # + +F97D ; 8AD2 ; MA # ( 諒 → 諒 ) CJK COMPATIBILITY IDEOGRAPH-F97D → CJK UNIFIED IDEOGRAPH-8AD2 # + +F941 ; 8AD6 ; MA # ( 論 → 論 ) CJK COMPATIBILITY IDEOGRAPH-F941 → CJK UNIFIED IDEOGRAPH-8AD6 # + +FABE ; 8AED ; MA # ( 諭 → 諭 ) CJK COMPATIBILITY IDEOGRAPH-FABE → CJK UNIFIED IDEOGRAPH-8AED # +2F9D0 ; 8AED ; MA # ( 諭 → 諭 ) CJK COMPATIBILITY IDEOGRAPH-2F9D0 → CJK UNIFIED IDEOGRAPH-8AED # + +FA22 ; 8AF8 ; MA # ( 諸 → 諸 ) CJK COMPATIBILITY IDEOGRAPH-FA22 → CJK UNIFIED 
IDEOGRAPH-8AF8 # +FABA ; 8AF8 ; MA # ( 諸 → 諸 ) CJK COMPATIBILITY IDEOGRAPH-FABA → CJK UNIFIED IDEOGRAPH-8AF8 # + +F95D ; 8AFE ; MA # ( 諾 → 諾 ) CJK COMPATIBILITY IDEOGRAPH-F95D → CJK UNIFIED IDEOGRAPH-8AFE # +FABD ; 8AFE ; MA # ( 諾 → 諾 ) CJK COMPATIBILITY IDEOGRAPH-FABD → CJK UNIFIED IDEOGRAPH-8AFE # + +FA62 ; 8B01 ; MA # ( 謁 → 謁 ) CJK COMPATIBILITY IDEOGRAPH-FA62 → CJK UNIFIED IDEOGRAPH-8B01 # +FABC ; 8B01 ; MA # ( 謁 → 謁 ) CJK COMPATIBILITY IDEOGRAPH-FABC → CJK UNIFIED IDEOGRAPH-8B01 # + +FA63 ; 8B39 ; MA # ( 謹 → 謹 ) CJK COMPATIBILITY IDEOGRAPH-FA63 → CJK UNIFIED IDEOGRAPH-8B39 # +FABF ; 8B39 ; MA # ( 謹 → 謹 ) CJK COMPATIBILITY IDEOGRAPH-FABF → CJK UNIFIED IDEOGRAPH-8B39 # + +F9FC ; 8B58 ; MA # ( 識 → 識 ) CJK COMPATIBILITY IDEOGRAPH-F9FC → CJK UNIFIED IDEOGRAPH-8B58 # + +F95A ; 8B80 ; MA # ( 讀 → 讀 ) CJK COMPATIBILITY IDEOGRAPH-F95A → CJK UNIFIED IDEOGRAPH-8B80 # + +8B8F ; 8B86 ; MA # ( 讏 → 讆 ) CJK UNIFIED IDEOGRAPH-8B8F → CJK UNIFIED IDEOGRAPH-8B86 # + +FAC0 ; 8B8A ; MA # ( 變 → 變 ) CJK COMPATIBILITY IDEOGRAPH-FAC0 → CJK UNIFIED IDEOGRAPH-8B8A # +2F9D1 ; 8B8A ; MA # ( 變 → 變 ) CJK COMPATIBILITY IDEOGRAPH-2F9D1 → CJK UNIFIED IDEOGRAPH-8B8A # + +2F95 ; 8C37 ; MA #* ( ⾕ → 谷 ) KANGXI RADICAL VALLEY → CJK UNIFIED IDEOGRAPH-8C37 # + +2F96 ; 8C46 ; MA #* ( ⾖ → 豆 ) KANGXI RADICAL BEAN → CJK UNIFIED IDEOGRAPH-8C46 # + +F900 ; 8C48 ; MA # ( 豈 → 豈 ) CJK COMPATIBILITY IDEOGRAPH-F900 → CJK UNIFIED IDEOGRAPH-8C48 # + +2F9D2 ; 8C55 ; MA # ( 豕 → 豕 ) CJK COMPATIBILITY IDEOGRAPH-2F9D2 → CJK UNIFIED IDEOGRAPH-8C55 # +2F97 ; 8C55 ; MA #* ( ⾗ → 豕 ) KANGXI RADICAL PIG → CJK UNIFIED IDEOGRAPH-8C55 # + +8C63 ; 8C5C ; MA # ( 豣 → 豜 ) CJK UNIFIED IDEOGRAPH-8C63 → CJK UNIFIED IDEOGRAPH-8C5C # + +2F98 ; 8C78 ; MA #* ( ⾘ → 豸 ) KANGXI RADICAL BADGER → CJK UNIFIED IDEOGRAPH-8C78 # + +2F9D3 ; 27CA8 ; MA # ( 𧲨 → 𧲨 ) CJK COMPATIBILITY IDEOGRAPH-2F9D3 → CJK UNIFIED IDEOGRAPH-27CA8 # + +2F99 ; 8C9D ; MA #* ( ⾙ → 貝 ) KANGXI RADICAL SHELL → CJK UNIFIED IDEOGRAPH-8C9D # + +2EC9 ; 8D1D ; MA #* ( ⻉ → 贝 ) CJK 
RADICAL C-SIMPLIFIED SHELL → CJK UNIFIED IDEOGRAPH-8D1D # + +2F9D4 ; 8CAB ; MA # ( 貫 → 貫 ) CJK COMPATIBILITY IDEOGRAPH-2F9D4 → CJK UNIFIED IDEOGRAPH-8CAB # + +2F9D5 ; 8CC1 ; MA # ( 賁 → 賁 ) CJK COMPATIBILITY IDEOGRAPH-2F9D5 → CJK UNIFIED IDEOGRAPH-8CC1 # + +F948 ; 8CC2 ; MA # ( 賂 → 賂 ) CJK COMPATIBILITY IDEOGRAPH-F948 → CJK UNIFIED IDEOGRAPH-8CC2 # + +F903 ; 8CC8 ; MA # ( 賈 → 賈 ) CJK COMPATIBILITY IDEOGRAPH-F903 → CJK UNIFIED IDEOGRAPH-8CC8 # + +FA64 ; 8CD3 ; MA # ( 賓 → 賓 ) CJK COMPATIBILITY IDEOGRAPH-FA64 → CJK UNIFIED IDEOGRAPH-8CD3 # + +FA65 ; 8D08 ; MA # ( 贈 → 贈 ) CJK COMPATIBILITY IDEOGRAPH-FA65 → CJK UNIFIED IDEOGRAPH-8D08 # +FAC1 ; 8D08 ; MA # ( 贈 → 贈 ) CJK COMPATIBILITY IDEOGRAPH-FAC1 → CJK UNIFIED IDEOGRAPH-8D08 # + +25AD4 ; 8D1B ; MA # ( 𥫔 → 贛 ) CJK UNIFIED IDEOGRAPH-25AD4 → CJK UNIFIED IDEOGRAPH-8D1B # →贛→ +2F9D6 ; 8D1B ; MA # ( 贛 → 贛 ) CJK COMPATIBILITY IDEOGRAPH-2F9D6 → CJK UNIFIED IDEOGRAPH-8D1B # + +2F9A ; 8D64 ; MA #* ( ⾚ → 赤 ) KANGXI RADICAL RED → CJK UNIFIED IDEOGRAPH-8D64 # + +2F9B ; 8D70 ; MA #* ( ⾛ → 走 ) KANGXI RADICAL RUN → CJK UNIFIED IDEOGRAPH-8D70 # + +2F9D7 ; 8D77 ; MA # ( 起 → 起 ) CJK COMPATIBILITY IDEOGRAPH-2F9D7 → CJK UNIFIED IDEOGRAPH-8D77 # + +8D86 ; 8D7F ; MA # ( 趆 → 赿 ) CJK UNIFIED IDEOGRAPH-8D86 → CJK UNIFIED IDEOGRAPH-8D7F # + +FAD7 ; 27ED3 ; MA # ( 𧻓 → 𧻓 ) CJK COMPATIBILITY IDEOGRAPH-FAD7 → CJK UNIFIED IDEOGRAPH-27ED3 # + +2F9D8 ; 27F2F ; MA # ( 𧼯 → 𧼯 ) CJK COMPATIBILITY IDEOGRAPH-2F9D8 → CJK UNIFIED IDEOGRAPH-27F2F # + +2F9C ; 8DB3 ; MA #* ( ⾜ → 足 ) KANGXI RADICAL FOOT → CJK UNIFIED IDEOGRAPH-8DB3 # + +2F9DA ; 8DCB ; MA # ( 跋 → 跋 ) CJK COMPATIBILITY IDEOGRAPH-2F9DA → CJK UNIFIED IDEOGRAPH-8DCB # + +2F9DB ; 8DBC ; MA # ( 趼 → 趼 ) CJK COMPATIBILITY IDEOGRAPH-2F9DB → CJK UNIFIED IDEOGRAPH-8DBC # + +8DFA ; 8DE5 ; MA # ( 跺 → 跥 ) CJK UNIFIED IDEOGRAPH-8DFA → CJK UNIFIED IDEOGRAPH-8DE5 # + +F937 ; 8DEF ; MA # ( 路 → 路 ) CJK COMPATIBILITY IDEOGRAPH-F937 → CJK UNIFIED IDEOGRAPH-8DEF # + +2F9DC ; 8DF0 ; MA # ( 跰 → 跰 ) CJK COMPATIBILITY 
IDEOGRAPH-2F9DC → CJK UNIFIED IDEOGRAPH-8DF0 # + +8E9B ; 8E97 ; MA # ( 躛 → 躗 ) CJK UNIFIED IDEOGRAPH-8E9B → CJK UNIFIED IDEOGRAPH-8E97 # + +2F9D ; 8EAB ; MA #* ( ⾝ → 身 ) KANGXI RADICAL BODY → CJK UNIFIED IDEOGRAPH-8EAB # + +F902 ; 8ECA ; MA # ( 車 → 車 ) CJK COMPATIBILITY IDEOGRAPH-F902 → CJK UNIFIED IDEOGRAPH-8ECA # +2F9E ; 8ECA ; MA #* ( ⾞ → 車 ) KANGXI RADICAL CART → CJK UNIFIED IDEOGRAPH-8ECA # + +2ECB ; 8F66 ; MA #* ( ⻋ → 车 ) CJK RADICAL C-SIMPLIFIED CART → CJK UNIFIED IDEOGRAPH-8F66 # + +2F9DE ; 8ED4 ; MA # ( 軔 → 軔 ) CJK COMPATIBILITY IDEOGRAPH-2F9DE → CJK UNIFIED IDEOGRAPH-8ED4 # + +8F27 ; 8EFF ; MA # ( 輧 → 軿 ) CJK UNIFIED IDEOGRAPH-8F27 → CJK UNIFIED IDEOGRAPH-8EFF # + +F998 ; 8F26 ; MA # ( 輦 → 輦 ) CJK COMPATIBILITY IDEOGRAPH-F998 → CJK UNIFIED IDEOGRAPH-8F26 # + +F9D7 ; 8F2A ; MA # ( 輪 → 輪 ) CJK COMPATIBILITY IDEOGRAPH-F9D7 → CJK UNIFIED IDEOGRAPH-8F2A # + +FAC2 ; 8F38 ; MA # ( 輸 → 輸 ) CJK COMPATIBILITY IDEOGRAPH-FAC2 → CJK UNIFIED IDEOGRAPH-8F38 # +2F9DF ; 8F38 ; MA # ( 輸 → 輸 ) CJK COMPATIBILITY IDEOGRAPH-2F9DF → CJK UNIFIED IDEOGRAPH-8F38 # + +FA07 ; 8F3B ; MA # ( 輻 → 輻 ) CJK COMPATIBILITY IDEOGRAPH-FA07 → CJK UNIFIED IDEOGRAPH-8F3B # + +F98D ; 8F62 ; MA # ( 轢 → 轢 ) CJK COMPATIBILITY IDEOGRAPH-F98D → CJK UNIFIED IDEOGRAPH-8F62 # + +2F9F ; 8F9B ; MA #* ( ⾟ → 辛 ) KANGXI RADICAL BITTER → CJK UNIFIED IDEOGRAPH-8F9B # + +2F98D ; 8F9E ; MA # ( 辞 → 辞 ) CJK COMPATIBILITY IDEOGRAPH-2F98D → CJK UNIFIED IDEOGRAPH-8F9E # + +F971 ; 8FB0 ; MA # ( 辰 → 辰 ) CJK COMPATIBILITY IDEOGRAPH-F971 → CJK UNIFIED IDEOGRAPH-8FB0 # +2FA0 ; 8FB0 ; MA #* ( ⾠ → 辰 ) KANGXI RADICAL MORNING → CJK UNIFIED IDEOGRAPH-8FB0 # + +2FA1 ; 8FB5 ; MA #* ( ⾡ → 辵 ) KANGXI RADICAL WALK → CJK UNIFIED IDEOGRAPH-8FB5 # + +FA66 ; 8FB6 ; MA # ( 辶 → 辶 ) CJK COMPATIBILITY IDEOGRAPH-FA66 → CJK UNIFIED IDEOGRAPH-8FB6 # +2ECC ; 8FB6 ; MA #* ( ⻌ → 辶 ) CJK RADICAL SIMPLIFIED WALK → CJK UNIFIED IDEOGRAPH-8FB6 # +2ECD ; 8FB6 ; MA #* ( ⻍ → 辶 ) CJK RADICAL WALK ONE → CJK UNIFIED IDEOGRAPH-8FB6 # + +2F881 ; 5DE1 ; MA # ( 
巡 → 巡 ) CJK COMPATIBILITY IDEOGRAPH-2F881 → CJK UNIFIED IDEOGRAPH-5DE1 # + +F99A ; 9023 ; MA # ( 連 → 連 ) CJK COMPATIBILITY IDEOGRAPH-F99A → CJK UNIFIED IDEOGRAPH-9023 # + +FA25 ; 9038 ; MA # ( 逸 → 逸 ) CJK COMPATIBILITY IDEOGRAPH-FA25 → CJK UNIFIED IDEOGRAPH-9038 # +FA67 ; 9038 ; MA # ( 逸 → 逸 ) CJK COMPATIBILITY IDEOGRAPH-FA67 → CJK UNIFIED IDEOGRAPH-9038 # + +FAC3 ; 9072 ; MA # ( 遲 → 遲 ) CJK COMPATIBILITY IDEOGRAPH-FAC3 → CJK UNIFIED IDEOGRAPH-9072 # + +F9C3 ; 907C ; MA # ( 遼 → 遼 ) CJK COMPATIBILITY IDEOGRAPH-F9C3 → CJK UNIFIED IDEOGRAPH-907C # + +2F9E0 ; 285D2 ; MA # ( 𨗒 → 𨗒 ) CJK COMPATIBILITY IDEOGRAPH-2F9E0 → CJK UNIFIED IDEOGRAPH-285D2 # + +2F9E1 ; 285ED ; MA # ( 𨗭 → 𨗭 ) CJK COMPATIBILITY IDEOGRAPH-2F9E1 → CJK UNIFIED IDEOGRAPH-285ED # + +F913 ; 908F ; MA # ( 邏 → 邏 ) CJK COMPATIBILITY IDEOGRAPH-F913 → CJK UNIFIED IDEOGRAPH-908F # + +2FA2 ; 9091 ; MA #* ( ⾢ → 邑 ) KANGXI RADICAL CITY → CJK UNIFIED IDEOGRAPH-9091 # + +2F9E2 ; 9094 ; MA # ( 邔 → 邔 ) CJK COMPATIBILITY IDEOGRAPH-2F9E2 → CJK UNIFIED IDEOGRAPH-9094 # + +F92C ; 90CE ; MA # ( 郎 → 郎 ) CJK COMPATIBILITY IDEOGRAPH-F92C → CJK UNIFIED IDEOGRAPH-90CE # +90DE ; 90CE ; MA # ( 郞 → 郎 ) CJK UNIFIED IDEOGRAPH-90DE → CJK UNIFIED IDEOGRAPH-90CE # →郎→ +FA2E ; 90CE ; MA # ( 郞 → 郎 ) CJK COMPATIBILITY IDEOGRAPH-FA2E → CJK UNIFIED IDEOGRAPH-90CE # →郞→→郎→ + +2F9E3 ; 90F1 ; MA # ( 郱 → 郱 ) CJK COMPATIBILITY IDEOGRAPH-2F9E3 → CJK UNIFIED IDEOGRAPH-90F1 # + +FA26 ; 90FD ; MA # ( 都 → 都 ) CJK COMPATIBILITY IDEOGRAPH-FA26 → CJK UNIFIED IDEOGRAPH-90FD # + +2F9E5 ; 2872E ; MA # ( 𨜮 → 𨜮 ) CJK COMPATIBILITY IDEOGRAPH-2F9E5 → CJK UNIFIED IDEOGRAPH-2872E # + +2F9E4 ; 9111 ; MA # ( 鄑 → 鄑 ) CJK COMPATIBILITY IDEOGRAPH-2F9E4 → CJK UNIFIED IDEOGRAPH-9111 # + +2F9E6 ; 911B ; MA # ( 鄛 → 鄛 ) CJK COMPATIBILITY IDEOGRAPH-2F9E6 → CJK UNIFIED IDEOGRAPH-911B # + +2FA3 ; 9149 ; MA #* ( ⾣ → 酉 ) KANGXI RADICAL WINE → CJK UNIFIED IDEOGRAPH-9149 # + +F919 ; 916A ; MA # ( 酪 → 酪 ) CJK COMPATIBILITY IDEOGRAPH-F919 → CJK UNIFIED IDEOGRAPH-916A # + +FAC4 ; 
9199 ; MA # ( 醙 → 醙 ) CJK COMPATIBILITY IDEOGRAPH-FAC4 → CJK UNIFIED IDEOGRAPH-9199 # + +F9B7 ; 91B4 ; MA # ( 醴 → 醴 ) CJK COMPATIBILITY IDEOGRAPH-F9B7 → CJK UNIFIED IDEOGRAPH-91B4 # + +2FA4 ; 91C6 ; MA #* ( ⾤ → 釆 ) KANGXI RADICAL DISTINGUISH → CJK UNIFIED IDEOGRAPH-91C6 # + +F9E9 ; 91CC ; MA # ( 里 → 里 ) CJK COMPATIBILITY IDEOGRAPH-F9E9 → CJK UNIFIED IDEOGRAPH-91CC # +2FA5 ; 91CC ; MA #* ( ⾥ → 里 ) KANGXI RADICAL VILLAGE → CJK UNIFIED IDEOGRAPH-91CC # + +F97E ; 91CF ; MA # ( 量 → 量 ) CJK COMPATIBILITY IDEOGRAPH-F97E → CJK UNIFIED IDEOGRAPH-91CF # + +F90A ; 91D1 ; MA # ( 金 → 金 ) CJK COMPATIBILITY IDEOGRAPH-F90A → CJK UNIFIED IDEOGRAPH-91D1 # +2FA6 ; 91D1 ; MA #* ( ⾦ → 金 ) KANGXI RADICAL GOLD → CJK UNIFIED IDEOGRAPH-91D1 # + +2ED0 ; 9485 ; MA #* ( ⻐ → 钅 ) CJK RADICAL C-SIMPLIFIED GOLD → CJK UNIFIED IDEOGRAPH-9485 # + +F9B1 ; 9234 ; MA # ( 鈴 → 鈴 ) CJK COMPATIBILITY IDEOGRAPH-F9B1 → CJK UNIFIED IDEOGRAPH-9234 # + +2F9E7 ; 9238 ; MA # ( 鈸 → 鈸 ) CJK COMPATIBILITY IDEOGRAPH-2F9E7 → CJK UNIFIED IDEOGRAPH-9238 # + +FAC5 ; 9276 ; MA # ( 鉶 → 鉶 ) CJK COMPATIBILITY IDEOGRAPH-FAC5 → CJK UNIFIED IDEOGRAPH-9276 # + +2F9E8 ; 92D7 ; MA # ( 鋗 → 鋗 ) CJK COMPATIBILITY IDEOGRAPH-2F9E8 → CJK UNIFIED IDEOGRAPH-92D7 # + +2F9E9 ; 92D8 ; MA # ( 鋘 → 鋘 ) CJK COMPATIBILITY IDEOGRAPH-2F9E9 → CJK UNIFIED IDEOGRAPH-92D8 # + +2F9EA ; 927C ; MA # ( 鉼 → 鉼 ) CJK COMPATIBILITY IDEOGRAPH-2F9EA → CJK UNIFIED IDEOGRAPH-927C # + +F93F ; 9304 ; MA # ( 錄 → 錄 ) CJK COMPATIBILITY IDEOGRAPH-F93F → CJK UNIFIED IDEOGRAPH-9304 # + +F99B ; 934A ; MA # ( 鍊 → 鍊 ) CJK COMPATIBILITY IDEOGRAPH-F99B → CJK UNIFIED IDEOGRAPH-934A # + +93AE ; 93AD ; MA # ( 鎮 → 鎭 ) CJK UNIFIED IDEOGRAPH-93AE → CJK UNIFIED IDEOGRAPH-93AD # + +2F9EB ; 93F9 ; MA # ( 鏹 → 鏹 ) CJK COMPATIBILITY IDEOGRAPH-2F9EB → CJK UNIFIED IDEOGRAPH-93F9 # + +2F9EC ; 9415 ; MA # ( 鐕 → 鐕 ) CJK COMPATIBILITY IDEOGRAPH-2F9EC → CJK UNIFIED IDEOGRAPH-9415 # + +2F9ED ; 28BFA ; MA # ( 𨯺 → 𨯺 ) CJK COMPATIBILITY IDEOGRAPH-2F9ED → CJK UNIFIED IDEOGRAPH-28BFA # + +2ED2 ; 9578 
; MA #* ( ⻒ → 镸 ) CJK RADICAL LONG TWO → CJK UNIFIED IDEOGRAPH-9578 # + +2ED3 ; 957F ; MA #* ( ⻓ → 长 ) CJK RADICAL C-SIMPLIFIED LONG → CJK UNIFIED IDEOGRAPH-957F # + +2FA8 ; 9580 ; MA #* ( ⾨ → 門 ) KANGXI RADICAL GATE → CJK UNIFIED IDEOGRAPH-9580 # + +2ED4 ; 95E8 ; MA #* ( ⻔ → 门 ) CJK RADICAL C-SIMPLIFIED GATE → CJK UNIFIED IDEOGRAPH-95E8 # + +2F9EE ; 958B ; MA # ( 開 → 開 ) CJK COMPATIBILITY IDEOGRAPH-2F9EE → CJK UNIFIED IDEOGRAPH-958B # + +2F9EF ; 4995 ; MA # ( 䦕 → 䦕 ) CJK COMPATIBILITY IDEOGRAPH-2F9EF → CJK UNIFIED IDEOGRAPH-4995 # + +F986 ; 95AD ; MA # ( 閭 → 閭 ) CJK COMPATIBILITY IDEOGRAPH-F986 → CJK UNIFIED IDEOGRAPH-95AD # + +2F9F0 ; 95B7 ; MA # ( 閷 → 閷 ) CJK COMPATIBILITY IDEOGRAPH-2F9F0 → CJK UNIFIED IDEOGRAPH-95B7 # + +2F9F1 ; 28D77 ; MA # ( 𨵷 → 𨵷 ) CJK COMPATIBILITY IDEOGRAPH-2F9F1 → CJK UNIFIED IDEOGRAPH-28D77 # + +2FA9 ; 961C ; MA #* ( ⾩ → 阜 ) KANGXI RADICAL MOUND → CJK UNIFIED IDEOGRAPH-961C # + +2ECF ; 961D ; MA #* ( ⻏ → 阝 ) CJK RADICAL CITY → CJK UNIFIED IDEOGRAPH-961D # +2ED6 ; 961D ; MA #* ( ⻖ → 阝 ) CJK RADICAL MOUND TWO → CJK UNIFIED IDEOGRAPH-961D # + +F9C6 ; 962E ; MA # ( 阮 → 阮 ) CJK COMPATIBILITY IDEOGRAPH-F9C6 → CJK UNIFIED IDEOGRAPH-962E # + +F951 ; 964B ; MA # ( 陋 → 陋 ) CJK COMPATIBILITY IDEOGRAPH-F951 → CJK UNIFIED IDEOGRAPH-964B # + +FA09 ; 964D ; MA # ( 降 → 降 ) CJK COMPATIBILITY IDEOGRAPH-FA09 → CJK UNIFIED IDEOGRAPH-964D # + +F959 ; 9675 ; MA # ( 陵 → 陵 ) CJK COMPATIBILITY IDEOGRAPH-F959 → CJK UNIFIED IDEOGRAPH-9675 # + +F9D3 ; 9678 ; MA # ( 陸 → 陸 ) CJK COMPATIBILITY IDEOGRAPH-F9D3 → CJK UNIFIED IDEOGRAPH-9678 # + +FAC6 ; 967C ; MA # ( 陼 → 陼 ) CJK COMPATIBILITY IDEOGRAPH-FAC6 → CJK UNIFIED IDEOGRAPH-967C # + +F9DC ; 9686 ; MA # ( 隆 → 隆 ) CJK COMPATIBILITY IDEOGRAPH-F9DC → CJK UNIFIED IDEOGRAPH-9686 # + +F9F1 ; 96A3 ; MA # ( 隣 → 隣 ) CJK COMPATIBILITY IDEOGRAPH-F9F1 → CJK UNIFIED IDEOGRAPH-96A3 # + +2F9F2 ; 49E6 ; MA # ( 䧦 → 䧦 ) CJK COMPATIBILITY IDEOGRAPH-2F9F2 → CJK UNIFIED IDEOGRAPH-49E6 # + +2FAA ; 96B6 ; MA #* ( ⾪ → 隶 ) KANGXI RADICAL 
SLAVE → CJK UNIFIED IDEOGRAPH-96B6 # + +FA2F ; 96B7 ; MA # ( 隷 → 隷 ) CJK COMPATIBILITY IDEOGRAPH-FA2F → CJK UNIFIED IDEOGRAPH-96B7 # +96B8 ; 96B7 ; MA # ( 隸 → 隷 ) CJK UNIFIED IDEOGRAPH-96B8 → CJK UNIFIED IDEOGRAPH-96B7 # →隸→ +F9B8 ; 96B7 ; MA # ( 隸 → 隷 ) CJK COMPATIBILITY IDEOGRAPH-F9B8 → CJK UNIFIED IDEOGRAPH-96B7 # + +2FAB ; 96B9 ; MA #* ( ⾫ → 隹 ) KANGXI RADICAL SHORT TAILED BIRD → CJK UNIFIED IDEOGRAPH-96B9 # + +2F9F3 ; 96C3 ; MA # ( 雃 → 雃 ) CJK COMPATIBILITY IDEOGRAPH-2F9F3 → CJK UNIFIED IDEOGRAPH-96C3 # + +F9EA ; 96E2 ; MA # ( 離 → 離 ) CJK COMPATIBILITY IDEOGRAPH-F9EA → CJK UNIFIED IDEOGRAPH-96E2 # + +FA68 ; 96E3 ; MA # ( 難 → 難 ) CJK COMPATIBILITY IDEOGRAPH-FA68 → CJK UNIFIED IDEOGRAPH-96E3 # +FAC7 ; 96E3 ; MA # ( 難 → 難 ) CJK COMPATIBILITY IDEOGRAPH-FAC7 → CJK UNIFIED IDEOGRAPH-96E3 # + +2FAC ; 96E8 ; MA #* ( ⾬ → 雨 ) KANGXI RADICAL RAIN → CJK UNIFIED IDEOGRAPH-96E8 # + +F9B2 ; 96F6 ; MA # ( 零 → 零 ) CJK COMPATIBILITY IDEOGRAPH-F9B2 → CJK UNIFIED IDEOGRAPH-96F6 # + +F949 ; 96F7 ; MA # ( 雷 → 雷 ) CJK COMPATIBILITY IDEOGRAPH-F949 → CJK UNIFIED IDEOGRAPH-96F7 # + +2F9F5 ; 9723 ; MA # ( 霣 → 霣 ) CJK COMPATIBILITY IDEOGRAPH-2F9F5 → CJK UNIFIED IDEOGRAPH-9723 # + +2F9F6 ; 29145 ; MA # ( 𩅅 → 𩅅 ) CJK COMPATIBILITY IDEOGRAPH-2F9F6 → CJK UNIFIED IDEOGRAPH-29145 # + +F938 ; 9732 ; MA # ( 露 → 露 ) CJK COMPATIBILITY IDEOGRAPH-F938 → CJK UNIFIED IDEOGRAPH-9732 # + +F9B3 ; 9748 ; MA # ( 靈 → 靈 ) CJK COMPATIBILITY IDEOGRAPH-F9B3 → CJK UNIFIED IDEOGRAPH-9748 # + +2FAD ; 9751 ; MA #* ( ⾭ → 靑 ) KANGXI RADICAL BLUE → CJK UNIFIED IDEOGRAPH-9751 # + +2ED8 ; 9752 ; MA #* ( ⻘ → 青 ) CJK RADICAL BLUE → CJK UNIFIED IDEOGRAPH-9752 # + +FA1C ; 9756 ; MA # ( 靖 → 靖 ) CJK COMPATIBILITY IDEOGRAPH-FA1C → CJK UNIFIED IDEOGRAPH-9756 # +FAC8 ; 9756 ; MA # ( 靖 → 靖 ) CJK COMPATIBILITY IDEOGRAPH-FAC8 → CJK UNIFIED IDEOGRAPH-9756 # + +2F81C ; 291DF ; MA # ( 𩇟 → 𩇟 ) CJK COMPATIBILITY IDEOGRAPH-2F81C → CJK UNIFIED IDEOGRAPH-291DF # + +2FAE ; 975E ; MA #* ( ⾮ → 非 ) KANGXI RADICAL WRONG → CJK UNIFIED 
IDEOGRAPH-975E # + +2FAF ; 9762 ; MA #* ( ⾯ → 面 ) KANGXI RADICAL FACE → CJK UNIFIED IDEOGRAPH-9762 # + +2F9F7 ; 2921A ; MA # ( 𩈚 → 𩈚 ) CJK COMPATIBILITY IDEOGRAPH-2F9F7 → CJK UNIFIED IDEOGRAPH-2921A # + +2FB0 ; 9769 ; MA #* ( ⾰ → 革 ) KANGXI RADICAL LEATHER → CJK UNIFIED IDEOGRAPH-9769 # + +2F9F8 ; 4A6E ; MA # ( 䩮 → 䩮 ) CJK COMPATIBILITY IDEOGRAPH-2F9F8 → CJK UNIFIED IDEOGRAPH-4A6E # + +2F9F9 ; 4A76 ; MA # ( 䩶 → 䩶 ) CJK COMPATIBILITY IDEOGRAPH-2F9F9 → CJK UNIFIED IDEOGRAPH-4A76 # + +2FB1 ; 97CB ; MA #* ( ⾱ → 韋 ) KANGXI RADICAL TANNED LEATHER → CJK UNIFIED IDEOGRAPH-97CB # + +2ED9 ; 97E6 ; MA #* ( ⻙ → 韦 ) CJK RADICAL C-SIMPLIFIED TANNED LEATHER → CJK UNIFIED IDEOGRAPH-97E6 # + +FAC9 ; 97DB ; MA # ( 韛 → 韛 ) CJK COMPATIBILITY IDEOGRAPH-FAC9 → CJK UNIFIED IDEOGRAPH-97DB # + +2F9FA ; 97E0 ; MA # ( 韠 → 韠 ) CJK COMPATIBILITY IDEOGRAPH-2F9FA → CJK UNIFIED IDEOGRAPH-97E0 # + +2FB2 ; 97ED ; MA #* ( ⾲ → 韭 ) KANGXI RADICAL LEEK → CJK UNIFIED IDEOGRAPH-97ED # + +2F9FB ; 2940A ; MA # ( 𩐊 → 𩐊 ) CJK COMPATIBILITY IDEOGRAPH-2F9FB → CJK UNIFIED IDEOGRAPH-2940A # + +2FB3 ; 97F3 ; MA #* ( ⾳ → 音 ) KANGXI RADICAL SOUND → CJK UNIFIED IDEOGRAPH-97F3 # + +FA69 ; 97FF ; MA # ( 響 → 響 ) CJK COMPATIBILITY IDEOGRAPH-FA69 → CJK UNIFIED IDEOGRAPH-97FF # +FACA ; 97FF ; MA # ( 響 → 響 ) CJK COMPATIBILITY IDEOGRAPH-FACA → CJK UNIFIED IDEOGRAPH-97FF # + +2FB4 ; 9801 ; MA #* ( ⾴ → 頁 ) KANGXI RADICAL LEAF → CJK UNIFIED IDEOGRAPH-9801 # + +2EDA ; 9875 ; MA #* ( ⻚ → 页 ) CJK RADICAL C-SIMPLIFIED LEAF → CJK UNIFIED IDEOGRAPH-9875 # + +2F9FC ; 4AB2 ; MA # ( 䪲 → 䪲 ) CJK COMPATIBILITY IDEOGRAPH-2F9FC → CJK UNIFIED IDEOGRAPH-4AB2 # + +FACB ; 980B ; MA # ( 頋 → 頋 ) CJK COMPATIBILITY IDEOGRAPH-FACB → CJK UNIFIED IDEOGRAPH-980B # +2F9FE ; 980B ; MA # ( 頋 → 頋 ) CJK COMPATIBILITY IDEOGRAPH-2F9FE → CJK UNIFIED IDEOGRAPH-980B # +2F9FF ; 980B ; MA # ( 頋 → 頋 ) CJK COMPATIBILITY IDEOGRAPH-2F9FF → CJK UNIFIED IDEOGRAPH-980B # + +F9B4 ; 9818 ; MA # ( 領 → 領 ) CJK COMPATIBILITY IDEOGRAPH-F9B4 → CJK UNIFIED IDEOGRAPH-9818 # + 
+2FA00 ; 9829 ; MA # ( 頩 → 頩 ) CJK COMPATIBILITY IDEOGRAPH-2FA00 → CJK UNIFIED IDEOGRAPH-9829 # + +2F9FD ; 29496 ; MA # ( 𩒖 → 𩒖 ) CJK COMPATIBILITY IDEOGRAPH-2F9FD → CJK UNIFIED IDEOGRAPH-29496 # + +FA6A ; 983B ; MA # ( 頻 → 頻 ) CJK COMPATIBILITY IDEOGRAPH-FA6A → CJK UNIFIED IDEOGRAPH-983B # +FACC ; 983B ; MA # ( 頻 → 頻 ) CJK COMPATIBILITY IDEOGRAPH-FACC → CJK UNIFIED IDEOGRAPH-983B # + +F9D0 ; 985E ; MA # ( 類 → 類 ) CJK COMPATIBILITY IDEOGRAPH-F9D0 → CJK UNIFIED IDEOGRAPH-985E # + +2FB5 ; 98A8 ; MA #* ( ⾵ → 風 ) KANGXI RADICAL WIND → CJK UNIFIED IDEOGRAPH-98A8 # + +2EDB ; 98CE ; MA #* ( ⻛ → 风 ) CJK RADICAL C-SIMPLIFIED WIND → CJK UNIFIED IDEOGRAPH-98CE # + +2FA01 ; 295B6 ; MA # ( 𩖶 → 𩖶 ) CJK COMPATIBILITY IDEOGRAPH-2FA01 → CJK UNIFIED IDEOGRAPH-295B6 # + +2FB6 ; 98DB ; MA #* ( ⾶ → 飛 ) KANGXI RADICAL FLY → CJK UNIFIED IDEOGRAPH-98DB # + +2EDC ; 98DE ; MA #* ( ⻜ → 飞 ) CJK RADICAL C-SIMPLIFIED FLY → CJK UNIFIED IDEOGRAPH-98DE # + +2EDD ; 98DF ; MA #* ( ⻝ → 食 ) CJK RADICAL EAT ONE → CJK UNIFIED IDEOGRAPH-98DF # +2FB7 ; 98DF ; MA #* ( ⾷ → 食 ) KANGXI RADICAL EAT → CJK UNIFIED IDEOGRAPH-98DF # + +2EDF ; 98E0 ; MA #* ( ⻟ → 飠 ) CJK RADICAL EAT THREE → CJK UNIFIED IDEOGRAPH-98E0 # + +2EE0 ; 9963 ; MA #* ( ⻠ → 饣 ) CJK RADICAL C-SIMPLIFIED EAT → CJK UNIFIED IDEOGRAPH-9963 # + +2FA02 ; 98E2 ; MA # ( 飢 → 飢 ) CJK COMPATIBILITY IDEOGRAPH-2FA02 → CJK UNIFIED IDEOGRAPH-98E2 # + +FA2A ; 98EF ; MA # ( 飯 → 飯 ) CJK COMPATIBILITY IDEOGRAPH-FA2A → CJK UNIFIED IDEOGRAPH-98EF # + +FA2B ; 98FC ; MA # ( 飼 → 飼 ) CJK COMPATIBILITY IDEOGRAPH-FA2B → CJK UNIFIED IDEOGRAPH-98FC # + +2FA03 ; 4B33 ; MA # ( 䬳 → 䬳 ) CJK COMPATIBILITY IDEOGRAPH-2FA03 → CJK UNIFIED IDEOGRAPH-4B33 # + +FA2C ; 9928 ; MA # ( 館 → 館 ) CJK COMPATIBILITY IDEOGRAPH-FA2C → CJK UNIFIED IDEOGRAPH-9928 # + +2FA04 ; 9929 ; MA # ( 餩 → 餩 ) CJK COMPATIBILITY IDEOGRAPH-2FA04 → CJK UNIFIED IDEOGRAPH-9929 # + +2FB8 ; 9996 ; MA #* ( ⾸ → 首 ) KANGXI RADICAL HEAD → CJK UNIFIED IDEOGRAPH-9996 # + +2FB9 ; 9999 ; MA #* ( ⾹ → 香 ) KANGXI RADICAL 
FRAGRANT → CJK UNIFIED IDEOGRAPH-9999 # + +2FA05 ; 99A7 ; MA # ( 馧 → 馧 ) CJK COMPATIBILITY IDEOGRAPH-2FA05 → CJK UNIFIED IDEOGRAPH-99A7 # + +2FBA ; 99AC ; MA #* ( ⾺ → 馬 ) KANGXI RADICAL HORSE → CJK UNIFIED IDEOGRAPH-99AC # + +2EE2 ; 9A6C ; MA #* ( ⻢ → 马 ) CJK RADICAL C-SIMPLIFIED HORSE → CJK UNIFIED IDEOGRAPH-9A6C # + +2FA06 ; 99C2 ; MA # ( 駂 → 駂 ) CJK COMPATIBILITY IDEOGRAPH-2FA06 → CJK UNIFIED IDEOGRAPH-99C2 # + +F91A ; 99F1 ; MA # ( 駱 → 駱 ) CJK COMPATIBILITY IDEOGRAPH-F91A → CJK UNIFIED IDEOGRAPH-99F1 # + +2FA07 ; 99FE ; MA # ( 駾 → 駾 ) CJK COMPATIBILITY IDEOGRAPH-2FA07 → CJK UNIFIED IDEOGRAPH-99FE # + +F987 ; 9A6A ; MA # ( 驪 → 驪 ) CJK COMPATIBILITY IDEOGRAPH-F987 → CJK UNIFIED IDEOGRAPH-9A6A # + +2FBB ; 9AA8 ; MA #* ( ⾻ → 骨 ) KANGXI RADICAL BONE → CJK UNIFIED IDEOGRAPH-9AA8 # + +2FA08 ; 4BCE ; MA # ( 䯎 → 䯎 ) CJK COMPATIBILITY IDEOGRAPH-2FA08 → CJK UNIFIED IDEOGRAPH-4BCE # + +2FBC ; 9AD8 ; MA #* ( ⾼ → 高 ) KANGXI RADICAL TALL → CJK UNIFIED IDEOGRAPH-9AD8 # + +2FBD ; 9ADF ; MA #* ( ⾽ → 髟 ) KANGXI RADICAL HAIR → CJK UNIFIED IDEOGRAPH-9ADF # + +2FA09 ; 29B30 ; MA # ( 𩬰 → 𩬰 ) CJK COMPATIBILITY IDEOGRAPH-2FA09 → CJK UNIFIED IDEOGRAPH-29B30 # + +FACD ; 9B12 ; MA # ( 鬒 → 鬒 ) CJK COMPATIBILITY IDEOGRAPH-FACD → CJK UNIFIED IDEOGRAPH-9B12 # +2FA0A ; 9B12 ; MA # ( 鬒 → 鬒 ) CJK COMPATIBILITY IDEOGRAPH-2FA0A → CJK UNIFIED IDEOGRAPH-9B12 # + +2FBE ; 9B25 ; MA #* ( ⾾ → 鬥 ) KANGXI RADICAL FIGHT → CJK UNIFIED IDEOGRAPH-9B25 # + +2FBF ; 9B2F ; MA #* ( ⾿ → 鬯 ) KANGXI RADICAL SACRIFICIAL WINE → CJK UNIFIED IDEOGRAPH-9B2F # + +2FC0 ; 9B32 ; MA #* ( ⿀ → 鬲 ) KANGXI RADICAL CAULDRON → CJK UNIFIED IDEOGRAPH-9B32 # + +2FC1 ; 9B3C ; MA #* ( ⿁ → 鬼 ) KANGXI RADICAL GHOST → CJK UNIFIED IDEOGRAPH-9B3C # +2EE4 ; 9B3C ; MA #* ( ⻤ → 鬼 ) CJK RADICAL GHOST → CJK UNIFIED IDEOGRAPH-9B3C # + +2FC2 ; 9B5A ; MA #* ( ⿂ → 魚 ) KANGXI RADICAL FISH → CJK UNIFIED IDEOGRAPH-9B5A # + +2EE5 ; 9C7C ; MA #* ( ⻥ → 鱼 ) CJK RADICAL C-SIMPLIFIED FISH → CJK UNIFIED IDEOGRAPH-9C7C # + +F939 ; 9B6F ; MA # ( 魯 → 魯 ) CJK 
COMPATIBILITY IDEOGRAPH-F939 → CJK UNIFIED IDEOGRAPH-9B6F # + +2FA0B ; 9C40 ; MA # ( 鱀 → 鱀 ) CJK COMPATIBILITY IDEOGRAPH-2FA0B → CJK UNIFIED IDEOGRAPH-9C40 # + +F9F2 ; 9C57 ; MA # ( 鱗 → 鱗 ) CJK COMPATIBILITY IDEOGRAPH-F9F2 → CJK UNIFIED IDEOGRAPH-9C57 # + +2FC3 ; 9CE5 ; MA #* ( ⿃ → 鳥 ) KANGXI RADICAL BIRD → CJK UNIFIED IDEOGRAPH-9CE5 # + +2FA0C ; 9CFD ; MA # ( 鳽 → 鳽 ) CJK COMPATIBILITY IDEOGRAPH-2FA0C → CJK UNIFIED IDEOGRAPH-9CFD # + +2FA0D ; 4CCE ; MA # ( 䳎 → 䳎 ) CJK COMPATIBILITY IDEOGRAPH-2FA0D → CJK UNIFIED IDEOGRAPH-4CCE # + +9E43 ; 9E42 ; MA # ( 鹃 → 鹂 ) CJK UNIFIED IDEOGRAPH-9E43 → CJK UNIFIED IDEOGRAPH-9E42 # + +2FA0F ; 9D67 ; MA # ( 鵧 → 鵧 ) CJK COMPATIBILITY IDEOGRAPH-2FA0F → CJK UNIFIED IDEOGRAPH-9D67 # + +2FA0E ; 4CED ; MA # ( 䳭 → 䳭 ) CJK COMPATIBILITY IDEOGRAPH-2FA0E → CJK UNIFIED IDEOGRAPH-4CED # + +2FA10 ; 2A0CE ; MA # ( 𪃎 → 𪃎 ) CJK COMPATIBILITY IDEOGRAPH-2FA10 → CJK UNIFIED IDEOGRAPH-2A0CE # + +FA2D ; 9DB4 ; MA # ( 鶴 → 鶴 ) CJK COMPATIBILITY IDEOGRAPH-FA2D → CJK UNIFIED IDEOGRAPH-9DB4 # + +2FA12 ; 2A105 ; MA # ( 𪄅 → 𪄅 ) CJK COMPATIBILITY IDEOGRAPH-2FA12 → CJK UNIFIED IDEOGRAPH-2A105 # + +2FA11 ; 4CF8 ; MA # ( 䳸 → 䳸 ) CJK COMPATIBILITY IDEOGRAPH-2FA11 → CJK UNIFIED IDEOGRAPH-4CF8 # + +F93A ; 9DFA ; MA # ( 鷺 → 鷺 ) CJK COMPATIBILITY IDEOGRAPH-F93A → CJK UNIFIED IDEOGRAPH-9DFA # + +2FA13 ; 2A20E ; MA # ( 𪈎 → 𪈎 ) CJK COMPATIBILITY IDEOGRAPH-2FA13 → CJK UNIFIED IDEOGRAPH-2A20E # + +F920 ; 9E1E ; MA # ( 鸞 → 鸞 ) CJK COMPATIBILITY IDEOGRAPH-F920 → CJK UNIFIED IDEOGRAPH-9E1E # + +2FC4 ; 9E75 ; MA #* ( ⿄ → 鹵 ) KANGXI RADICAL SALT → CJK UNIFIED IDEOGRAPH-9E75 # + +F940 ; 9E7F ; MA # ( 鹿 → 鹿 ) CJK COMPATIBILITY IDEOGRAPH-F940 → CJK UNIFIED IDEOGRAPH-9E7F # +2FC5 ; 9E7F ; MA #* ( ⿅ → 鹿 ) KANGXI RADICAL DEER → CJK UNIFIED IDEOGRAPH-9E7F # + +2FA14 ; 2A291 ; MA # ( 𪊑 → 𪊑 ) CJK COMPATIBILITY IDEOGRAPH-2FA14 → CJK UNIFIED IDEOGRAPH-2A291 # + +F988 ; 9E97 ; MA # ( 麗 → 麗 ) CJK COMPATIBILITY IDEOGRAPH-F988 → CJK UNIFIED IDEOGRAPH-9E97 # + +F9F3 ; 9E9F ; MA # ( 麟 → 麟 ) CJK 
COMPATIBILITY IDEOGRAPH-F9F3 → CJK UNIFIED IDEOGRAPH-9E9F # + +2FC6 ; 9EA5 ; MA #* ( ⿆ → 麥 ) KANGXI RADICAL WHEAT → CJK UNIFIED IDEOGRAPH-9EA5 # + +2EE8 ; 9EA6 ; MA #* ( ⻨ → 麦 ) CJK RADICAL SIMPLIFIED WHEAT → CJK UNIFIED IDEOGRAPH-9EA6 # + +2FA15 ; 9EBB ; MA # ( 麻 → 麻 ) CJK COMPATIBILITY IDEOGRAPH-2FA15 → CJK UNIFIED IDEOGRAPH-9EBB # +2FC7 ; 9EBB ; MA #* ( ⿇ → 麻 ) KANGXI RADICAL HEMP → CJK UNIFIED IDEOGRAPH-9EBB # + +2F88F ; 2A392 ; MA # ( 𪎒 → 𪎒 ) CJK COMPATIBILITY IDEOGRAPH-2F88F → CJK UNIFIED IDEOGRAPH-2A392 # + +2FC8 ; 9EC3 ; MA #* ( ⿈ → 黃 ) KANGXI RADICAL YELLOW → CJK UNIFIED IDEOGRAPH-9EC3 # + +2EE9 ; 9EC4 ; MA #* ( ⻩ → 黄 ) CJK RADICAL SIMPLIFIED YELLOW → CJK UNIFIED IDEOGRAPH-9EC4 # + +2FC9 ; 9ECD ; MA #* ( ⿉ → 黍 ) KANGXI RADICAL MILLET → CJK UNIFIED IDEOGRAPH-9ECD # + +F989 ; 9ECE ; MA # ( 黎 → 黎 ) CJK COMPATIBILITY IDEOGRAPH-F989 → CJK UNIFIED IDEOGRAPH-9ECE # + +2FA16 ; 4D56 ; MA # ( 䵖 → 䵖 ) CJK COMPATIBILITY IDEOGRAPH-2FA16 → CJK UNIFIED IDEOGRAPH-4D56 # + +2FCA ; 9ED1 ; MA #* ( ⿊ → 黑 ) KANGXI RADICAL BLACK → CJK UNIFIED IDEOGRAPH-9ED1 # +9ED2 ; 9ED1 ; MA # ( 黒 → 黑 ) CJK UNIFIED IDEOGRAPH-9ED2 → CJK UNIFIED IDEOGRAPH-9ED1 # →⿊→ + +FA3A ; 58A8 ; MA # ( 墨 → 墨 ) CJK COMPATIBILITY IDEOGRAPH-FA3A → CJK UNIFIED IDEOGRAPH-58A8 # + +2FA17 ; 9EF9 ; MA # ( 黹 → 黹 ) CJK COMPATIBILITY IDEOGRAPH-2FA17 → CJK UNIFIED IDEOGRAPH-9EF9 # +2FCB ; 9EF9 ; MA #* ( ⿋ → 黹 ) KANGXI RADICAL EMBROIDERY → CJK UNIFIED IDEOGRAPH-9EF9 # + +2FCC ; 9EFD ; MA #* ( ⿌ → 黽 ) KANGXI RADICAL FROG → CJK UNIFIED IDEOGRAPH-9EFD # + +2FA18 ; 9EFE ; MA # ( 黾 → 黾 ) CJK COMPATIBILITY IDEOGRAPH-2FA18 → CJK UNIFIED IDEOGRAPH-9EFE # + +2FA19 ; 9F05 ; MA # ( 鼅 → 鼅 ) CJK COMPATIBILITY IDEOGRAPH-2FA19 → CJK UNIFIED IDEOGRAPH-9F05 # + +2FCD ; 9F0E ; MA #* ( ⿍ → 鼎 ) KANGXI RADICAL TRIPOD → CJK UNIFIED IDEOGRAPH-9F0E # + +2FA1A ; 9F0F ; MA # ( 鼏 → 鼏 ) CJK COMPATIBILITY IDEOGRAPH-2FA1A → CJK UNIFIED IDEOGRAPH-9F0F # + +2FCE ; 9F13 ; MA #* ( ⿎ → 鼓 ) KANGXI RADICAL DRUM → CJK UNIFIED IDEOGRAPH-9F13 # + +2FA1B ; 
9F16 ; MA # ( 鼖 → 鼖 ) CJK COMPATIBILITY IDEOGRAPH-2FA1B → CJK UNIFIED IDEOGRAPH-9F16 # + +2FCF ; 9F20 ; MA #* ( ⿏ → 鼠 ) KANGXI RADICAL RAT → CJK UNIFIED IDEOGRAPH-9F20 # + +2FA1C ; 9F3B ; MA # ( 鼻 → 鼻 ) CJK COMPATIBILITY IDEOGRAPH-2FA1C → CJK UNIFIED IDEOGRAPH-9F3B # +2FD0 ; 9F3B ; MA #* ( ⿐ → 鼻 ) KANGXI RADICAL NOSE → CJK UNIFIED IDEOGRAPH-9F3B # + +FAD8 ; 9F43 ; MA # ( 齃 → 齃 ) CJK COMPATIBILITY IDEOGRAPH-FAD8 → CJK UNIFIED IDEOGRAPH-9F43 # + +2FD1 ; 9F4A ; MA #* ( ⿑ → 齊 ) KANGXI RADICAL EVEN → CJK UNIFIED IDEOGRAPH-9F4A # + +2EEC ; 9F50 ; MA #* ( ⻬ → 齐 ) CJK RADICAL C-SIMPLIFIED EVEN → CJK UNIFIED IDEOGRAPH-9F50 # + +2FD2 ; 9F52 ; MA #* ( ⿒ → 齒 ) KANGXI RADICAL TOOTH → CJK UNIFIED IDEOGRAPH-9F52 # + +2EEE ; 9F7F ; MA #* ( ⻮ → 齿 ) CJK RADICAL C-SIMPLIFIED TOOTH → CJK UNIFIED IDEOGRAPH-9F7F # + +2FA1D ; 2A600 ; MA # ( 𪘀 → 𪘀 ) CJK COMPATIBILITY IDEOGRAPH-2FA1D → CJK UNIFIED IDEOGRAPH-2A600 # + +F9C4 ; 9F8D ; MA # ( 龍 → 龍 ) CJK COMPATIBILITY IDEOGRAPH-F9C4 → CJK UNIFIED IDEOGRAPH-9F8D # +2FD3 ; 9F8D ; MA #* ( ⿓ → 龍 ) KANGXI RADICAL DRAGON → CJK UNIFIED IDEOGRAPH-9F8D # + +2EF0 ; 9F99 ; MA #* ( ⻰ → 龙 ) CJK RADICAL C-SIMPLIFIED DRAGON → CJK UNIFIED IDEOGRAPH-9F99 # + +FAD9 ; 9F8E ; MA # ( 龎 → 龎 ) CJK COMPATIBILITY IDEOGRAPH-FAD9 → CJK UNIFIED IDEOGRAPH-9F8E # + +F907 ; 9F9C ; MA # ( 龜 → 龜 ) CJK COMPATIBILITY IDEOGRAPH-F907 → CJK UNIFIED IDEOGRAPH-9F9C # +F908 ; 9F9C ; MA # ( 龜 → 龜 ) CJK COMPATIBILITY IDEOGRAPH-F908 → CJK UNIFIED IDEOGRAPH-9F9C # +FACE ; 9F9C ; MA # ( 龜 → 龜 ) CJK COMPATIBILITY IDEOGRAPH-FACE → CJK UNIFIED IDEOGRAPH-9F9C # +2FD4 ; 9F9C ; MA #* ( ⿔ → 龜 ) KANGXI RADICAL TURTLE → CJK UNIFIED IDEOGRAPH-9F9C # + +2EF3 ; 9F9F ; MA #* ( ⻳ → 龟 ) CJK RADICAL C-SIMPLIFIED TURTLE → CJK UNIFIED IDEOGRAPH-9F9F # + +2FD5 ; 9FA0 ; MA #* ( ⿕ → 龠 ) KANGXI RADICAL FLUTE → CJK UNIFIED IDEOGRAPH-9FA0 # + +0CDC ; 0C5C ; MA # ( ೜ → ౜ ) KANNADA ARCHAIC SHRII → TELUGU ARCHAIC SHRII # + +1DE8 ; 1ADA ; MA # ( ᷨ → ᫚ ) COMBINING LATIN SMALL LETTER B → COMBINING FLAT SIGN # + +2DEE 
; 1ADB ; MA # ( ⷮ → ᫛ ) COMBINING CYRILLIC LETTER TE → COMBINING DOWN TACK ABOVE # + +1AE7 ; 1AE5 ; MA # ( ᫧ → ᫥ ) COMBINING DOUBLE ARCH ABOVE → COMBINING SEAGULL ABOVE # + +031A ; 1AE9 ; MA # ( ̚ → ᫩ ) COMBINING LEFT ANGLE ABOVE → COMBINING LEFT ANGLE CENTRED ABOVE # + +0295 ; A7CE ; MA # ( ʕ → ꟎ ) LATIN LETTER PHARYNGEAL VOICED FRICATIVE → LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE # +A7CF ; A7CE ; MA # ( ꟏ → ꟎ ) LATIN SMALL LETTER PHARYNGEAL VOICED FRICATIVE → LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE # →ʕ→ + +0348 ; 10EFA ; MA # ( ͈ → 𐻺 ) COMBINING DOUBLE VERTICAL LINE BELOW → ARABIC DOUBLE VERTICAL BAR BELOW # + +0956 ; 11B62 ; MA # ( ॖ → 𑭢 ) DEVANAGARI VOWEL SIGN UE → SHARADA VOWEL SIGN UE # +0A41 ; 11B62 ; MA # ( ੁ → 𑭢 ) GURMUKHI VOWEL SIGN U → SHARADA VOWEL SIGN UE # →ॖ→ + +0957 ; 11B63 ; MA # ( ॗ → 𑭣 ) DEVANAGARI VOWEL SIGN UUE → SHARADA VOWEL SIGN UUE # +0A42 ; 11B63 ; MA # ( ੂ → 𑭣 ) GURMUKHI VOWEL SIGN UU → SHARADA VOWEL SIGN UUE # →ॗ→ + +0947 ; 11B64 ; MA # ( े → 𑭤 ) DEVANAGARI VOWEL SIGN E → SHARADA VOWEL SIGN SHORT E # +0A47 ; 11B64 ; MA # ( ੇ → 𑭤 ) GURMUKHI VOWEL SIGN EE → SHARADA VOWEL SIGN SHORT E # →े→ + +5152 ; 16FF3 ; MA # ( 兒 → 𖿳 ) CJK UNIFIED IDEOGRAPH-5152 → CHINESE SMALL TRADITIONAL ER # + +1F40D ; 1CCFA ; MA #* ( 🐍 → 𜳺 ) SNAKE → SNAKE SYMBOL # + +1F443 ; 1CCFC ; MA #* ( 👃 → 𜳼 ) NOSE → NOSE SYMBOL # + +1F377 ; 1CEBA ; MA #* ( 🍷 → 𜺺 ) WINE GLASS → FRAGILE SYMBOL # + +1F3E2 ; 1CEBB ; MA #* ( 🏢 → 𜺻 ) OFFICE BUILDING → OFFICE BUILDING SYMBOL # + +1F333 ; 1CEBC ; MA #* ( 🌳 → 𜺼 ) DECIDUOUS TREE → TREE SYMBOL # + +1F34E ; 1CEBD ; MA #* ( 🍎 → 𜺽 ) RED APPLE → APPLE SYMBOL # +1F34F ; 1CEBD ; MA #* ( 🍏 → 𜺽 ) GREEN APPLE → APPLE SYMBOL # + +1F352 ; 1CEBE ; MA #* ( 🍒 → 𜺾 ) CHERRIES → CHERRY SYMBOL # + +1F353 ; 1CEBF ; MA #* ( 🍓 → 𜺿 ) STRAWBERRY → STRAWBERRY SYMBOL # + +28FF ; 1CEE0 ; MA #* ( ⣿ → 𜻠 ) BRAILLE PATTERN DOTS-12345678 → GEOMANTIC FIGURE POPULUS # + +29B5 ; 1CEF0 ; MA #* ( ⦵ → 𜻰 ) CIRCLE WITH HORIZONTAL BAR → MEDIUM SMALL WHITE 
CIRCLE WITH HORIZONTAL BAR # + +21C4 ; 1F8D0 ; MA #* ( ⇄ → 🣐 ) RIGHTWARDS ARROW OVER LEFTWARDS ARROW → LONG RIGHTWARDS ARROW OVER LONG LEFTWARDS ARROW # + +21CC ; 1F8D1 ; MA #* ( ⇌ → 🣑 ) RIGHTWARDS HARPOON OVER LEFTWARDS HARPOON → LONG RIGHTWARDS HARPOON OVER LONG LEFTWARDS HARPOON # + +2657 ; 1FA55 ; MA #* ( ♗ → 🩕 ) WHITE CHESS BISHOP → WHITE CHESS ALFIL # + +265D ; 1FA57 ; MA #* ( ♝ → 🩗 ) BLACK CHESS BISHOP → BLACK CHESS ALFIL # + +1F514 ; 1FBFA ; MA #* ( 🔔 → 🯺 ) BELL → ALARM BELL SYMBOL # + +6138 ; 2B73F ; MA # ( 愸 → 𫜿 ) CJK UNIFIED IDEOGRAPH-6138 → CJK UNIFIED IDEOGRAPH-2B73F # + +# total: 6565 + diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/ExtendedPictographicTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/ExtendedPictographicTest.java new file mode 100644 index 000000000..9325fc584 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/ExtendedPictographicTest.java @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.tokenize.uax29; + +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.ValueSource; + +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class ExtendedPictographicTest { + + @ParameterizedTest + @ValueSource(ints = {0x00A9, 0x00AE, 0x203C, 0x2764, 0x1F600, 0x1F468}) + void testPictographicCodePoints(int codePoint) { + assertTrue(ExtendedPictographic.is(codePoint), + () -> String.format("U+%04X should be Extended_Pictographic", codePoint)); + } + + @ParameterizedTest + @ValueSource(ints = {'a', '5', ' ', 0x0301, 0x05D0, 0x1F1E6}) + void testNonPictographicCodePoints(int codePoint) { + // 0x1F1E6 (regional indicator A) is a supplementary code point that is NOT pictographic. + assertFalse(ExtendedPictographic.is(codePoint)); + } + + @ParameterizedTest + @ValueSource(ints = {-1, Integer.MIN_VALUE, Character.MAX_CODE_POINT + 1, Integer.MAX_VALUE}) + void testOutOfRangeIsFalseAndSafe(int codePoint) { + assertFalse(ExtendedPictographic.is(codePoint)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBoundaryConformanceTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBoundaryConformanceTest.java new file mode 100644 index 000000000..e1bc8231d --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBoundaryConformanceTest.java @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +import java.io.BufferedReader; +import java.io.IOException; +import java.io.InputStream; +import java.io.InputStreamReader; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertTrue; + +/** + * Runs the official Unicode {@code WordBreakTest.txt} conformance suite against + * {@link WordSegmenter}. Each line marks boundaries with U+00F7 (division sign) and non-boundaries + * with U+00D7 (multiplication sign) between code points. + */ +public class WordBoundaryConformanceTest { + + private static final int BOUNDARY = 0x00F7; // division sign + private static final int NO_BOUNDARY = 0x00D7; // multiplication sign + + @Test + void testOfficialUnicodeWordBreakConformance() throws IOException { + int total = 0; + int passed = 0; + final List<String> failures = new ArrayList<>(); + + try (InputStream in = WordBoundaryConformanceTest.class.getResourceAsStream("WordBreakTest.txt"); + BufferedReader reader = + new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) { + String raw; + int lineNumber = 0; + while ((raw = reader.readLine()) != null) { + lineNumber++; + final int hash = raw.indexOf('#'); + final String content = (hash < 0 ?
raw : raw.substring(0, hash)).strip(); + if (content.isEmpty()) { + continue; + } + final String[] tokens = content.split("\\s+"); + + final StringBuilder text = new StringBuilder(); + final List<Integer> expected = new ArrayList<>(); + expected.add(0); // tokens[0] is always a leading boundary marker. + int offset = 0; + for (int k = 1; k < tokens.length; k += 2) { + final int codePoint = Integer.parseInt(tokens[k], 16); + text.appendCodePoint(codePoint); + offset += Character.charCount(codePoint); + if (tokens[k + 1].codePointAt(0) == BOUNDARY) { + expected.add(offset); + } + } + + final int[] actual = WordSegmenter.boundaries(text); + final int[] expectedArray = expected.stream().mapToInt(Integer::intValue).toArray(); + total++; + if (Arrays.equals(actual, expectedArray)) { + passed++; + } else if (failures.size() < 25) { + failures.add("line " + lineNumber + ": " + content + + "\n expected=" + Arrays.toString(expectedArray) + + "\n actual =" + Arrays.toString(actual)); + } + } + } + + final int passRate = total == 0 ? 0 : passed * 100 / total; + System.out.println("UAX#29 word-break conformance: " + passed + "/" + total + + " (" + passRate + "%)"); + assertTrue(total > 1900, "expected the full conformance suite to load, ran only " + total); + assertTrue(failures.isEmpty(), + "UAX#29 word-break conformance: " + passed + "/" + total + " (" + passRate + + "%). First failures:\n" + String.join("\n", failures)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBreakPropertyTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBreakPropertyTest.java new file mode 100644 index 000000000..5735fca03 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordBreakPropertyTest.java @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements.
See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.ValueSource; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertSame; + +public class WordBreakPropertyTest { + + @Test + void testAsciiLettersAndDigits() { + assertSame(WordBreak.ALETTER, WordBreakProperty.of('a')); + assertSame(WordBreak.ALETTER, WordBreakProperty.of('Z')); + assertSame(WordBreak.NUMERIC, WordBreakProperty.of('0')); + assertSame(WordBreak.NUMERIC, WordBreakProperty.of('9')); + } + + @Test + void testWhitespaceAndLineBreaks() { + assertSame(WordBreak.WSEG_SPACE, WordBreakProperty.of(0x0020)); // space + assertSame(WordBreak.CR, WordBreakProperty.of(0x000D)); + assertSame(WordBreak.LF, WordBreakProperty.of(0x000A)); + assertSame(WordBreak.NEWLINE, WordBreakProperty.of(0x000B)); // vertical tab + } + + @Test + void testMidAndExtendClasses() { + assertSame(WordBreak.MID_NUM, WordBreakProperty.of(0x002C)); // comma + assertSame(WordBreak.MID_NUM_LET, WordBreakProperty.of(0x002E)); // full stop + assertSame(WordBreak.MID_LETTER, WordBreakProperty.of(0x003A)); // colon + assertSame(WordBreak.EXTEND_NUM_LET, 
WordBreakProperty.of(0x005F)); // low line + assertSame(WordBreak.EXTEND, WordBreakProperty.of(0x0301)); // combining acute + } + + @Test + void testQuotesJoinerAndScriptLetters() { + assertSame(WordBreak.SINGLE_QUOTE, WordBreakProperty.of(0x0027)); + assertSame(WordBreak.DOUBLE_QUOTE, WordBreakProperty.of(0x0022)); + assertSame(WordBreak.ZWJ, WordBreakProperty.of(0x200D)); + assertSame(WordBreak.HEBREW_LETTER, WordBreakProperty.of(0x05D0)); + assertSame(WordBreak.KATAKANA, WordBreakProperty.of(0x30A1)); + } + + @Test + void testSupplementaryCodePointsUseTheRangeTable() { + assertSame(WordBreak.REGIONAL_INDICATOR, WordBreakProperty.of(0x1F1E6)); // regional indicator A + assertSame(WordBreak.ALETTER, WordBreakProperty.of(0x1D400)); // math bold A + assertSame(WordBreak.OTHER, WordBreakProperty.of(0x1F600)); // grinning face + } + + @ParameterizedTest + @ValueSource(ints = {0x0021, 0x0040, 0x2014}) + void testUnassignedCodePointsAreOther(int codePoint) { + assertSame(WordBreak.OTHER, WordBreakProperty.of(codePoint)); + } + + @ParameterizedTest + @ValueSource(ints = {-1, Integer.MIN_VALUE, Character.MAX_CODE_POINT + 1, Integer.MAX_VALUE}) + void testOutOfRangeIsOtherAndSafe(int codePoint) { + assertSame(WordBreak.OTHER, WordBreakProperty.of(codePoint)); + } + + @Test + void testFromPropertyNameRejectsUnknown() { + assertEquals(WordBreak.ALETTER, WordBreak.fromPropertyName("ALetter")); + org.junit.jupiter.api.Assertions.assertThrows(IllegalArgumentException.class, + () -> WordBreak.fromPropertyName("NotAValue")); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordSegmenterTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordSegmenterTest.java new file mode 100644 index 000000000..96dbc1491 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordSegmenterTest.java @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or 
more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +import java.util.ArrayList; +import java.util.List; + +import org.junit.jupiter.api.Test; + +import opennlp.tools.util.Span; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; + +public class WordSegmenterTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + private static List<String> words(String text) { + final List<String> out = new ArrayList<>(); + for (final Span span : WordSegmenter.segments(text)) { + out.add(span.getCoveredText(text).toString()); + } + return out; + } + + @Test + void testEnglishSentenceKeepsWordsAndSeparators() { + assertEquals(List.of("The", " ", "quick", " ", "fox"), words("The quick fox")); + } + + @Test + void testContractionStaysOneWord() { + assertEquals(List.of("don't"), words("don't")); // WB6/WB7 over the apostrophe + } + + @Test + void testDecimalNumberStaysOneToken() { + assertEquals(List.of("3.14"), words("3.14")); // WB11/WB12 + } + + @Test + void testAcronymWithInternalDotsStaysOneToken() { + assertEquals(List.of("U.S.A"), words("U.S.A")); // WB6/WB7 + } + + @Test + void testLettersAndDigitsJoin() { + assertEquals(List.of("a1b2"),
words("a1b2")); // WB9/WB10 + } + + @Test + void testWhitespaceRunIsASingleSegment() { + assertEquals(List.of("a", " ", "b"), words("a b")); // WB3d + } + + @Test + void testNewlineBreaksOnBothSides() { + assertEquals(List.of("a", "\n", "b"), words("a\nb")); // WB3a/WB3b + } + + @Test + void testCarriageReturnLineFeedStayTogether() { + assertEquals(List.of("a", "\r\n", "b"), words("a\r\nb")); // WB3 + } + + @Test + void testIdeographsSplitPerCharacter() { + assertEquals(List.of(cp(0x4E2D), cp(0x6587)), words(cp(0x4E2D) + cp(0x6587))); + } + + @Test + void testEmojiZwjSequenceStaysTogether() { + final String family = cp(0x1F468) + cp(0x200D) + cp(0x1F469); // man + ZWJ + woman + assertEquals(List.of(family), words(family)); // WB3c + } + + @Test + void testRegionalIndicatorFlagIsOneToken() { + final String flag = cp(0x1F1FA) + cp(0x1F1F8); // regional indicators U + S + assertEquals(List.of(flag), words(flag)); // WB15/WB16 + } + + @Test + void testEmptyText() { + assertEquals(List.of(), words("")); + assertArrayEquals(new int[] {0}, WordSegmenter.boundaries("")); + } + + @Test + void testBoundariesIncludeStartAndEnd() { + assertArrayEquals(new int[] {0, 2, 3, 5}, WordSegmenter.boundaries("ab cd")); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordTokenizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordTokenizerTest.java new file mode 100644 index 000000000..83e8c20fb --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/tokenize/uax29/WordTokenizerTest.java @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.tokenize.uax29; + +import java.util.List; + +import org.junit.jupiter.api.Test; + +import opennlp.tools.tokenize.Tokenizer; +import opennlp.tools.util.Span; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertThrows; + +public class WordTokenizerTest { + + private static final WordTokenizer TOKENIZER = new WordTokenizer(); + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + private static List words(String text) { + return List.of(TOKENIZER.tokenize(text)); + } + + @Test + void testDropsWhitespaceAndPunctuation() { + assertEquals(List.of("Hello", "world"), words("Hello, world!")); + } + + @Test + void testAlphanumericAndNumericTypes() { + final List tokens = TOKENIZER.tokenizeTyped("abc 123"); + assertEquals(2, tokens.size()); + assertEquals(WordType.ALPHANUMERIC, tokens.get(0).type()); + assertEquals("abc", tokens.get(0).text("abc 123")); + assertEquals(WordType.NUMERIC, tokens.get(1).type()); + assertEquals("123", tokens.get(1).text("abc 123")); + } + + @Test + void testDecimalIsSingleNumericToken() { + final List tokens = TOKENIZER.tokenizeTyped("3.14"); + assertEquals(1, tokens.size()); + assertEquals(WordType.NUMERIC, tokens.get(0).type()); + assertEquals("3.14", tokens.get(0).text("3.14")); + } + + @Test + void testIdeographsOnePerToken() { + final String text = cp(0x4E2D) + cp(0x6587); + final List tokens = 
TOKENIZER.tokenizeTyped(text); + assertEquals(2, tokens.size()); + assertEquals(WordType.IDEOGRAPHIC, tokens.get(0).type()); + assertEquals(WordType.IDEOGRAPHIC, tokens.get(1).type()); + } + + @Test + void testHiraganaSplitsPerCharacter() { + final String text = cp(0x3042) + cp(0x3044); // a + i + final List tokens = TOKENIZER.tokenizeTyped(text); + assertEquals(2, tokens.size()); + assertEquals(WordType.HIRAGANA, tokens.get(0).type()); + assertEquals(WordType.HIRAGANA, tokens.get(1).type()); + } + + @Test + void testKatakanaRunStaysTogether() { + final String text = cp(0x30A2) + cp(0x30A4); // a + i + final List tokens = TOKENIZER.tokenizeTyped(text); + assertEquals(1, tokens.size()); + assertEquals(WordType.KATAKANA, tokens.get(0).type()); + assertEquals(text, tokens.get(0).text(text)); + } + + @Test + void testHangulSyllablesStayTogether() { + final String text = cp(0xAC00) + cp(0xB098); // ga + na + final List tokens = TOKENIZER.tokenizeTyped(text); + assertEquals(1, tokens.size()); + assertEquals(WordType.HANGUL, tokens.get(0).type()); + assertEquals(text, tokens.get(0).text(text)); + } + + @Test + void testSoutheastAsianType() { + final String text = cp(0x0E01); // Thai letter ko kai + final List tokens = TOKENIZER.tokenizeTyped(text); + assertEquals(1, tokens.size()); + assertEquals(WordType.SOUTHEAST_ASIAN, tokens.get(0).type()); + } + + @Test + void testEmojiType() { + final String text = cp(0x1F600); // grinning face + final List tokens = TOKENIZER.tokenizeTyped(text); + assertEquals(1, tokens.size()); + assertEquals(WordType.EMOJI, tokens.get(0).type()); + } + + @Test + void testRegionalIndicatorFlagIsOneEmoji() { + final String flag = cp(0x1F1FA) + cp(0x1F1F8); // U + S + final List tokens = TOKENIZER.tokenizeTyped(flag); + assertEquals(1, tokens.size()); + assertEquals(WordType.EMOJI, tokens.get(0).type()); + assertEquals(flag, tokens.get(0).text(flag)); + } + + @Test + void testMaxTokenLengthChopsLongWords() { + final WordTokenizer tokenizer = new 
WordTokenizer(3); + assertEquals(List.of("abc", "def", "g"), List.of(tokenizer.tokenize("abcdefg"))); + } + + @Test + void testMaxTokenLengthNeverSplitsASurrogatePair() { + // A two-char emoji must be emitted whole even when the limit is one char. + final WordTokenizer tokenizer = new WordTokenizer(1); + final String emoji = cp(0x1F600); + final List tokens = tokenizer.tokenizeTyped(emoji); + assertEquals(1, tokens.size()); + assertEquals(emoji, tokens.get(0).text(emoji)); + } + + @Test + void testConstructorRejectsNonPositiveLength() { + assertThrows(IllegalArgumentException.class, () -> new WordTokenizer(0)); + assertThrows(IllegalArgumentException.class, () -> new WordTokenizer(-5)); + } + + @Test + void testEmptyText() { + assertEquals(List.of(), words("")); + assertEquals(List.of(), TOKENIZER.tokenizeTyped("")); + } + + @Test + void testUsableThroughTokenizerInterface() { + final Tokenizer tokenizer = new WordTokenizer(); + final String text = "Hello, world!"; + assertArrayEquals(new String[] {"Hello", "world"}, tokenizer.tokenize(text)); + final Span[] spans = tokenizer.tokenizePos(text); + assertEquals(2, spans.length); + assertEquals("Hello", spans[0].getCoveredText(text).toString()); + assertEquals("world", spans[1].getCoveredText(text).toString()); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizerTest.java new file mode 100644 index 000000000..2e5faa6a1 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CaseFoldCharSequenceNormalizerTest.java @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.Locale; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertSame; +import static org.junit.jupiter.api.Assertions.assertThrows; + +public class CaseFoldCharSequenceNormalizerTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + @Test + void testRootLowerCases() { + assertEquals("hello world", + CaseFoldCharSequenceNormalizer.getInstance().normalize("HeLLo WORLD").toString()); + } + + @Test + void testRootKeepsAsciiIForCapitalI() { + // Locale.ROOT lower-cases "I" to ASCII "i" (no Turkish dotless surprise). 
+ assertEquals("i", CaseFoldCharSequenceNormalizer.getInstance().normalize("I").toString()); + } + + @Test + void testTurkishLocaleUsesDotlessI() { + final CaseFoldCharSequenceNormalizer turkish = + CaseFoldCharSequenceNormalizer.getInstance(Locale.forLanguageTag("tr")); + assertEquals(cp(0x0131), turkish.normalize("I").toString()); // dotless lower-case i + } + + @Test + void testGetInstanceForRootReturnsTheSharedInstance() { + assertSame(CaseFoldCharSequenceNormalizer.getInstance(), + CaseFoldCharSequenceNormalizer.getInstance(Locale.ROOT)); + } + + @Test + void testNullLocaleIsRejected() { + assertThrows(NullPointerException.class, + () -> new CaseFoldCharSequenceNormalizer((Locale) null)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/ConfusablesTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/ConfusablesTest.java new file mode 100644 index 000000000..262fe5aa9 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/ConfusablesTest.java @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class ConfusablesTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + @Test + void testCyrillicLetterIsConfusableWithLatin() { + final String cyrillicA = cp(0x0430); // CYRILLIC SMALL LETTER A, looks like Latin 'a' + assertTrue(Confusables.confusable(cyrillicA, "a")); + assertFalse(Confusables.confusable(cyrillicA, "b")); + } + + @Test + void testHomoglyphSpoofWordReducesToLatinSpelling() { + final String spoof = "p" + cp(0x0430) + "yp" + cp(0x0430) + "l"; // paypal with Cyrillic a's + assertTrue(Confusables.confusable(spoof, "paypal")); + assertEquals(Confusables.skeleton("paypal"), Confusables.skeleton(spoof)); + } + + @Test + void testHorizontalEllipsisFoldsToThreeFullStops() { + assertEquals(Confusables.skeleton("..."), Confusables.skeleton(cp(0x2026))); + assertTrue(Confusables.confusable(cp(0x2026), "...")); + } + + @Test + void testDistinctWordsAreNotConfusable() { + assertFalse(Confusables.confusable("cat", "dog")); + } + + @Test + void testSkeletonIsIdempotent() { + final String skeleton = Confusables.skeleton(cp(0x0430) + "bc"); + assertEquals(skeleton, Confusables.skeleton(skeleton)); + } + + @Test + void testNormalizerProducesTheSkeleton() { + final String spoof = "p" + cp(0x0430) + "yp" + cp(0x0430) + "l"; + assertEquals(Confusables.skeleton(spoof), + ConfusableSkeletonCharSequenceNormalizer.getInstance().normalize(spoof).toString()); + } + + @Test + void testMultipleCyrillicLookalikesFold() { + final String spoof = "d" + cp(0x0430) + "t" + cp(0x0430); // "data" with Cyrillic a's + assertEquals(Confusables.skeleton("data"), Confusables.skeleton(spoof)); + } + + @Test + void testTermConfusableFoldDimension() { + final 
String spoof = "p" + cp(0x0430) + "yp" + cp(0x0430) + "l"; + final TermAnalyzer analyzer = TermAnalyzer.builder().confusableFold().build(); + assertEquals(Confusables.skeleton("paypal"), analyzer.analyze(spoof).get(0).normalized()); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizerTest.java new file mode 100644 index 000000000..81bbf1c91 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/GermanUmlautCharSequenceNormalizerTest.java @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +public class GermanUmlautCharSequenceNormalizerTest { + + private static final GermanUmlautCharSequenceNormalizer FOLD = + GermanUmlautCharSequenceNormalizer.getInstance(); + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + private static String fold(String text) { + return FOLD.normalize(text).toString(); + } + + @Test + void testLowercaseUmlauts() { + assertEquals("Mueller", fold("M" + cp(0x00FC) + "ller")); // Mueller + assertEquals("schoen", fold("sch" + cp(0x00F6) + "n")); // schoen + assertEquals("aerger", fold(cp(0x00E4) + "rger")); // aerger + } + + @Test + void testCapitalUmlauts() { + assertEquals("Oel", fold(cp(0x00D6) + "l")); // Oel + assertEquals("Aerger", fold(cp(0x00C4) + "rger")); // Aerger + assertEquals("Ueber", fold(cp(0x00DC) + "ber")); // Ueber + } + + @Test + void testEszett() { + assertEquals("Strasse", fold("Stra" + cp(0x00DF) + "e")); // Strasse + } + + @Test + void testAsciiAndOtherCharactersUnchanged() { + assertEquals("hello world 123", fold("hello world 123")); + } + + @Test + void testMixedSentence() { + final String input = "M" + cp(0x00FC) + "ller Stra" + cp(0x00DF) + "e"; // Mueller Strasse + assertEquals("Mueller Strasse", fold(input)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/NormalizationProfilesTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/NormalizationProfilesTest.java new file mode 100644 index 000000000..2d2c02b38 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/NormalizationProfilesTest.java @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. 
See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package opennlp.tools.util.normalizer; + +import java.util.List; + +import org.junit.jupiter.api.Test; + +import opennlp.tools.langdetect.Language; +import opennlp.tools.langdetect.LanguageDetector; +import opennlp.tools.stemmer.snowball.SnowballStemmer; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertSame; +import static org.junit.jupiter.api.Assertions.assertTrue; + +public class NormalizationProfilesTest { + + @Test + void testEnglishUsesTheGenericAccentFold() { + final NormalizationProfile profile = NormalizationProfiles.forLanguage("eng").orElseThrow(); + assertEquals(SnowballStemmer.ALGORITHM.ENGLISH, profile.stemmerAlgorithm()); + assertSame(AccentFoldCharSequenceNormalizer.getInstance(), profile.accentFold()); + assertEquals(List.of(Dimension.NFC, Dimension.CASE_FOLD, Dimension.ACCENT_FOLD, Dimension.STEM), + profile.searchAnalyzer().dimensions()); + } + + @Test + void testTwoLetterCodeResolvesToProfile() { + assertEquals(SnowballStemmer.ALGORITHM.GERMAN, + NormalizationProfiles.forLanguage("de").orElseThrow().stemmerAlgorithm()); + } + + @Test + void testGermanUsesTheGermanSpecificFold() { + final NormalizationProfile profile = 
NormalizationProfiles.forLanguage("deu").orElseThrow(); + assertSame(GermanUmlautCharSequenceNormalizer.getInstance(), profile.accentFold()); + assertEquals(List.of(Dimension.NFC, Dimension.CASE_FOLD, Dimension.ACCENT_FOLD, Dimension.STEM), + profile.searchAnalyzer().dimensions()); + } + + @Test + void testRomanceLanguagesUseTheGenericFold() { + for (final String language : List.of("fra", "spa", "por", "ita", "cat")) { + assertSame(AccentFoldCharSequenceNormalizer.getInstance(), + NormalizationProfiles.forLanguage(language).orElseThrow().accentFold()); + } + } + + @Test + void testNordicLanguageHasNoFold() { + final NormalizationProfile swedish = NormalizationProfiles.forLanguage("swe").orElseThrow(); + assertNull(swedish.accentFold()); + assertEquals(List.of(Dimension.NFC, Dimension.CASE_FOLD, Dimension.STEM), + swedish.searchAnalyzer().dimensions()); + } + + @Test + void testUnsupportedLanguageIsEmpty() { + assertTrue(NormalizationProfiles.forLanguage("jpn").isEmpty()); + assertTrue(NormalizationProfiles.forLanguage("zzz").isEmpty()); + } + + @Test + void testSearchAnalyzerStemsThroughTheChain() { + final NormalizationProfile english = NormalizationProfiles.forLanguage("eng").orElseThrow(); + assertEquals("cat", english.searchAnalyzer().analyze("Cats").get(0).normalized()); + } + + @Test + void testDetectDispatchesThroughTheDetector() { + final LanguageDetector detector = new LanguageDetector() { + @Override + public Language[] predictLanguages(CharSequence content) { + return new Language[] {new Language("deu")}; + } + + @Override + public Language predictLanguage(CharSequence content) { + return new Language("deu"); + } + + @Override + public String[] getSupportedLanguages() { + return new String[] {"deu"}; + } + }; + final NormalizationProfile profile = + NormalizationProfiles.detect("Guten Tag", detector).orElseThrow(); + assertEquals(SnowballStemmer.ALGORITHM.GERMAN, profile.stemmerAlgorithm()); + } + + @Test + void testDetectUnsupportedLanguageIsEmpty() { + 
final LanguageDetector detector = new LanguageDetector() { + @Override + public Language[] predictLanguages(CharSequence content) { + return new Language[] {new Language("jpn")}; + } + + @Override + public Language predictLanguage(CharSequence content) { + return new Language("jpn"); + } + + @Override + public String[] getSupportedLanguages() { + return new String[] {"jpn"}; + } + }; + assertTrue(NormalizationProfiles.detect("text", detector).isEmpty()); + } + + @Test + void testSupportedLanguagesCoverTheSnowballSet() { + assertEquals(19, NormalizationProfiles.supportedLanguages().size()); + assertTrue(NormalizationProfiles.supportedLanguages().containsAll(List.of("eng", "deu", "fra"))); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TermAnalyzerTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TermAnalyzerTest.java new file mode 100644 index 000000000..56f16899d --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/TermAnalyzerTest.java @@ -0,0 +1,211 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import java.util.List; +import java.util.Locale; +import java.util.Set; + +import org.junit.jupiter.api.Test; + +import opennlp.tools.lemmatizer.Lemmatizer; +import opennlp.tools.stemmer.PorterStemmer; +import opennlp.tools.util.Span; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertSame; +import static org.junit.jupiter.api.Assertions.assertThrows; + +public class TermAnalyzerTest { + + private static String cp(int codePoint) { + return new String(Character.toChars(codePoint)); + } + + @Test + void testNoDimensionsLeavesTokenUnchanged() { + final TermAnalyzer analyzer = TermAnalyzer.builder().build(); + final Term term = analyzer.analyze("Hello").get(0); + assertEquals("Hello", term.original()); + assertEquals("Hello", term.normalized()); + assertEquals("Hello", term.peel()); + assertEquals(List.of(), analyzer.dimensions()); + } + + @Test + void testChainAppliesInCanonicalOrderRegardlessOfBuilderOrder() { + // accentFold added before caseFold, but the canonical order is caseFold then accentFold. 
+ final TermAnalyzer analyzer = TermAnalyzer.builder().accentFold().caseFold().build(); + assertEquals(List.of(Dimension.CASE_FOLD, Dimension.ACCENT_FOLD), analyzer.dimensions()); + final String input = "CAF" + cp(0x00C9); // CAFE with capital acute E + final Term term = analyzer.analyze(input).get(0); + assertEquals(input, term.original()); + assertEquals("cafe", term.normalized()); + assertEquals("caf" + cp(0x00E9), term.peel()); // before accent folding: lower-case, acute kept + } + + @Test + void testStemIsTheTopLayer() { + final TermAnalyzer analyzer = + TermAnalyzer.builder().caseFold().stem(new PorterStemmer()).build(); + final Term term = analyzer.analyze("Running").get(0); + assertEquals("running", term.peel()); // case-folded form, before stemming + assertEquals("run", term.normalized()); + assertEquals("run", term.at(Dimension.STEM)); + } + + @Test + void testUnconfiguredCharDimensionComputedLazily() { + final TermAnalyzer analyzer = TermAnalyzer.builder().build(); + final Term term = analyzer.analyze("HELLO").get(0); + assertEquals("HELLO", term.normalized()); + assertEquals("hello", term.at(Dimension.CASE_FOLD)); // lazily added on top of the final form + } + + @Test + void testStemDimensionWithoutStemmerFailsLoudly() { + final TermAnalyzer analyzer = TermAnalyzer.builder().caseFold().build(); + final Term term = analyzer.analyze("running").get(0); + assertThrows(IllegalStateException.class, () -> term.at(Dimension.STEM)); + } + + @Test + void testLemmaWithoutLemmatizerFailsLoudly() { + final TermAnalyzer analyzer = TermAnalyzer.builder().build(); + final Term term = analyzer.analyze("running").get(0); + assertThrows(IllegalStateException.class, () -> term.at(Dimension.LEMMA)); + } + + @Test + void testAnalyzeTextProducesSpans() { + final TermAnalyzer analyzer = TermAnalyzer.builder().caseFold().build(); + final List<Term> terms = analyzer.analyze("The Cats"); + assertEquals(2, terms.size()); + assertEquals("The", terms.get(0).original()); 
+ assertEquals("the", terms.get(0).normalized()); + assertEquals(new Span(0, 3), terms.get(0).span()); + assertEquals("Cats", terms.get(1).original()); + assertEquals(new Span(4, 8), terms.get(1).span()); + } + + @Test + void testAnalyzeTokensHasNoSpan() { + final TermAnalyzer analyzer = TermAnalyzer.builder().caseFold().build(); + final List<Term> terms = analyzer.analyze(new String[] {"Cats"}, new String[] {"NNS"}); + assertNull(terms.get(0).span()); + assertEquals("cats", terms.get(0).normalized()); + } + + @Test + void testAnalyzeTokensRejectsLengthMismatch() { + final TermAnalyzer analyzer = TermAnalyzer.builder().build(); + assertThrows(IllegalArgumentException.class, + () -> analyzer.analyze(new String[] {"a", "b"}, new String[] {"X"})); + } + + @Test + void testTransformRejectsNonCharacterDimension() { + assertThrows(IllegalArgumentException.class, () -> TermAnalyzer.builder() + .transform(Dimension.STEM, CaseFoldCharSequenceNormalizer.getInstance())); + } + + @Test + void testLemmaWithLemmatizerAndTag() { + final Lemmatizer lemmatizer = new Lemmatizer() { + @Override + public String[] lemmatize(String[] tokens, String[] tags) { + return new String[] {"be"}; + } + + @Override + public List<List<String>> lemmatize(List<String> tokens, List<String> tags) { + return List.of(List.of("be")); + } + }; + final TermAnalyzer analyzer = + TermAnalyzer.builder().caseFold().lemmatize(lemmatizer).build(); + final Term term = analyzer.analyze(new String[] {"was"}, new String[] {"VBD"}).get(0); + assertEquals("be", term.normalized()); + } + + @Test + void testConfusableFoldComposesWithCaseFold() { + final TermAnalyzer analyzer = TermAnalyzer.builder().caseFold().confusableFold().build(); + final String spoof = "P" + cp(0x0430) + "yp" + cp(0x0430) + "l"; // Paypal with Cyrillic a's + assertEquals(Confusables.skeleton("paypal"), analyzer.analyze(spoof).get(0).normalized()); + } + + @Test + void testAtIsMemoized() { + final TermAnalyzer analyzer = TermAnalyzer.builder().build(); + final Term term = 
analyzer.analyze("HELLO").get(0); + final String first = term.at(Dimension.CASE_FOLD); + assertSame(first, term.at(Dimension.CASE_FOLD)); + } + + @Test + void testWhitespaceTargetIsConfigurable() { + final CharClass lineFold = CharClass.of(CodePointSet.of('\n', '\t'), '\n'); + final TermAnalyzer analyzer = TermAnalyzer.builder().whitespace(lineFold::collapse).build(); + final Term term = analyzer.analyze(new String[] {"a\n\n\tb"}, new String[] {"X"}).get(0); + assertEquals("a\nb", term.normalized()); + } + + @Test + void testCaseFoldLocaleAppliesTurkishRules() { + final TermAnalyzer analyzer = + TermAnalyzer.builder().caseFold(Locale.forLanguageTag("tr")).build(); + assertEquals(cp(0x0131), analyzer.analyze("I").get(0).normalized()); // dotless lowercase i + } + + @Test + void testAccentFoldScopeFoldsLatin() { + final TermAnalyzer analyzer = TermAnalyzer.builder() + .accentFold(Set.of(Character.UnicodeScript.LATIN), false).build(); + assertEquals("cafe", analyzer.analyze("caf" + cp(0x00E9)).get(0).normalized()); // cafe + acute + } + + @Test + void testMaxTokenLengthChopsTokens() { + final List<Term> terms = TermAnalyzer.builder().maxTokenLength(3).build().analyze("abcdefg"); + assertEquals(3, terms.size()); + assertEquals("abc", terms.get(0).original()); + assertEquals("def", terms.get(1).original()); + assertEquals("g", terms.get(2).original()); + } + + @Test + void testAnalyzeEmptyTextProducesNoTerms() { + assertEquals(List.of(), TermAnalyzer.builder().caseFold().build().analyze("")); + } + + @Test + void testWhitespaceOnlyInputHasNoWordTerms() { + assertEquals(List.of(), TermAnalyzer.builder().build().analyze(" \t ")); + } + + @Test + void testAtDimensionBelowFinalIsAppliedOnTop() { + // Final dimension is STEM; asking for NFC applies it on top of the stem (documented behavior). 
+ final TermAnalyzer analyzer = + TermAnalyzer.builder().caseFold().stem(new PorterStemmer()).build(); + final Term term = analyzer.analyze("Running").get(0); + assertEquals("run", term.normalized()); + assertEquals("run", term.at(Dimension.NFC)); + } +} diff --git a/opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29/WordBreakTest.txt b/opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29/WordBreakTest.txt new file mode 100644 index 000000000..042b02e77 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29/WordBreakTest.txt @@ -0,0 +1,1974 @@ +# WordBreakTest-17.0.0.txt +# Date: 2025-03-24, 14:46:35 GMT +# © 2025 Unicode®, Inc. +# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. +# For terms of use and license, see https://www.unicode.org/terms_of_use.html +# +# Unicode Character Database +# For documentation, see https://www.unicode.org/reports/tr44/ +# +# Default Word_Break Test +# +# Format: +# <string> (# <comment>)? +# <string> contains hex Unicode code points, with +# ÷ wherever there is a break opportunity, and +# × wherever there is not. +# <comment> the format can change, but currently it shows: +# - the sample character name +# - (x) the Word_Break property value for the sample character and +# any other properties relevant to the algorithm, as described in +# WordBreakTest.html +# - [x] the rule that determines whether there is a break or not, +# as listed in the Rules section of WordBreakTest.html +# +# These samples may be extended or changed in the future. 
+# +÷ 000D ÷ 000D ÷ # ÷ [0.2] (CR) ÷ [3.1] (CR) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 000D ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 000D × 000A ÷ # ÷ [0.2] (CR) × [3.0] (LF) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 000A ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 000D ÷ 000B ÷ # ÷ [0.2] (CR) ÷ [3.1] (Newline) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 000B ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 000D ÷ 0300 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 000D ÷ 0308 × 0300 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 000D ÷ 00AD ÷ # ÷ [0.2] (CR) ÷ [3.1] SOFT HYPHEN (Format) ÷ [0.3] +÷ 000D ÷ 0308 × 00AD ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 000D ÷ 3031 ÷ # ÷ [0.2] (CR) ÷ [3.1] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 3031 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 000D ÷ 24C2 ÷ # ÷ [0.2] (CR) ÷ [3.1] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 24C2 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 000D ÷ 0041 ÷ # ÷ [0.2] (CR) ÷ [3.1] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0041 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 000D ÷ 003A ÷ # ÷ [0.2] (CR) ÷ [3.1] COLON (MidLetter) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 003A ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000D ÷ 002C ÷ # ÷ [0.2] (CR) ÷ [3.1] COMMA (MidNum) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 002C ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000D ÷ 002E ÷ # ÷ [0.2] (CR) ÷ [3.1] FULL STOP (MidNumLet) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 002E ÷ # ÷ [0.2] (CR) ÷ [3.1] 
COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 000D ÷ 0030 ÷ # ÷ [0.2] (CR) ÷ [3.1] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0030 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 000D ÷ 005F ÷ # ÷ [0.2] (CR) ÷ [3.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 005F ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 000D ÷ 1F1E6 ÷ # ÷ [0.2] (CR) ÷ [3.1] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 1F1E6 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 000D ÷ 05D0 ÷ # ÷ [0.2] (CR) ÷ [3.1] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 05D0 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 000D ÷ 0022 ÷ # ÷ [0.2] (CR) ÷ [3.1] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0022 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 000D ÷ 0027 ÷ # ÷ [0.2] (CR) ÷ [3.1] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0027 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000D ÷ 200D ÷ # ÷ [0.2] (CR) ÷ [3.1] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 000D ÷ 0308 × 200D ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 000D ÷ 00A9 ÷ # ÷ [0.2] (CR) ÷ [3.1] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 00A9 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 000D ÷ 0020 ÷ # ÷ [0.2] (CR) ÷ [3.1] SPACE (WSegSpace) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0020 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 000D ÷ 0000 ÷ # ÷ [0.2] (CR) ÷ [3.1] (XXmExtPict) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0000 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] 
(XXmExtPict) ÷ [0.3] +÷ 000D ÷ 0061 × 2060 ÷ # ÷ [0.2] (CR) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000D ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (CR) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000D ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (CR) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000D ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (CR) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000D ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (CR) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000D ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (CR) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000D ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (CR) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING 
DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000D ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (CR) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000D ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] (CR) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000D ÷ 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] (CR) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000A ÷ 000D ÷ # ÷ [0.2] (LF) ÷ [3.1] (CR) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 000D ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 000A ÷ 000A ÷ # ÷ [0.2] (LF) ÷ [3.1] (LF) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 000A ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 000A ÷ 000B ÷ # ÷ [0.2] (LF) ÷ [3.1] (Newline) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 000B ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 000A ÷ 0300 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 000A ÷ 0308 × 0300 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 000A ÷ 00AD ÷ # ÷ [0.2] (LF) ÷ [3.1] SOFT HYPHEN (Format) ÷ [0.3] +÷ 000A ÷ 0308 × 00AD ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 000A ÷ 3031 ÷ # ÷ [0.2] (LF) ÷ [3.1] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 3031 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 000A ÷ 24C2 ÷ # ÷ [0.2] (LF) ÷ [3.1] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 24C2 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 000A ÷ 0041 ÷ # ÷ [0.2] 
(LF) ÷ [3.1] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0041 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 000A ÷ 003A ÷ # ÷ [0.2] (LF) ÷ [3.1] COLON (MidLetter) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 003A ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000A ÷ 002C ÷ # ÷ [0.2] (LF) ÷ [3.1] COMMA (MidNum) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 002C ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000A ÷ 002E ÷ # ÷ [0.2] (LF) ÷ [3.1] FULL STOP (MidNumLet) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 002E ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 000A ÷ 0030 ÷ # ÷ [0.2] (LF) ÷ [3.1] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0030 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 000A ÷ 005F ÷ # ÷ [0.2] (LF) ÷ [3.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 005F ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 000A ÷ 1F1E6 ÷ # ÷ [0.2] (LF) ÷ [3.1] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 1F1E6 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 000A ÷ 05D0 ÷ # ÷ [0.2] (LF) ÷ [3.1] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 05D0 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 000A ÷ 0022 ÷ # ÷ [0.2] (LF) ÷ [3.1] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0022 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 000A ÷ 0027 ÷ # ÷ [0.2] (LF) ÷ [3.1] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0027 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000A ÷ 200D ÷ # ÷ [0.2] (LF) ÷ [3.1] ZERO WIDTH JOINER 
(ZWJ) ÷ [0.3] +÷ 000A ÷ 0308 × 200D ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 000A ÷ 00A9 ÷ # ÷ [0.2] (LF) ÷ [3.1] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 00A9 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 000A ÷ 0020 ÷ # ÷ [0.2] (LF) ÷ [3.1] SPACE (WSegSpace) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0020 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 000A ÷ 0000 ÷ # ÷ [0.2] (LF) ÷ [3.1] (XXmExtPict) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0000 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 000A ÷ 0061 × 2060 ÷ # ÷ [0.2] (LF) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000A ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (LF) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000A ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (LF) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000A ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (LF) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000A ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (LF) ÷ [3.1] 
LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000A ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (LF) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000A ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (LF) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000A ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (LF) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000A ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] (LF) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000A ÷ 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000B ÷ 000D ÷ # ÷ [0.2] (Newline) ÷ [3.1] (CR) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 000D ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 000B ÷ 000A ÷ # ÷ [0.2] (Newline) ÷ [3.1] (LF) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 000A ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 000B ÷ 000B ÷ # ÷ [0.2] (Newline) ÷ [3.1] (Newline) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 000B ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 000B ÷ 0300 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 000B ÷ 0308 × 0300 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] 
COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 000B ÷ 00AD ÷ # ÷ [0.2] (Newline) ÷ [3.1] SOFT HYPHEN (Format) ÷ [0.3] +÷ 000B ÷ 0308 × 00AD ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 000B ÷ 3031 ÷ # ÷ [0.2] (Newline) ÷ [3.1] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 3031 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 000B ÷ 24C2 ÷ # ÷ [0.2] (Newline) ÷ [3.1] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 24C2 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 000B ÷ 0041 ÷ # ÷ [0.2] (Newline) ÷ [3.1] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0041 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 000B ÷ 003A ÷ # ÷ [0.2] (Newline) ÷ [3.1] COLON (MidLetter) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 003A ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000B ÷ 002C ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMMA (MidNum) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 002C ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000B ÷ 002E ÷ # ÷ [0.2] (Newline) ÷ [3.1] FULL STOP (MidNumLet) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 002E ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 000B ÷ 0030 ÷ # ÷ [0.2] (Newline) ÷ [3.1] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0030 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 000B ÷ 005F ÷ # ÷ [0.2] (Newline) ÷ [3.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 005F ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 000B ÷ 1F1E6 ÷ # ÷ [0.2] (Newline) ÷ [3.1] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 
000B ÷ 0308 ÷ 1F1E6 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 000B ÷ 05D0 ÷ # ÷ [0.2] (Newline) ÷ [3.1] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 05D0 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 000B ÷ 0022 ÷ # ÷ [0.2] (Newline) ÷ [3.1] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0022 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 000B ÷ 0027 ÷ # ÷ [0.2] (Newline) ÷ [3.1] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0027 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000B ÷ 200D ÷ # ÷ [0.2] (Newline) ÷ [3.1] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 000B ÷ 0308 × 200D ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 000B ÷ 00A9 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 00A9 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 000B ÷ 0020 ÷ # ÷ [0.2] (Newline) ÷ [3.1] SPACE (WSegSpace) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0020 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 000B ÷ 0000 ÷ # ÷ [0.2] (Newline) ÷ [3.1] (XXmExtPict) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0000 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 000B ÷ 0061 × 2060 ÷ # ÷ [0.2] (Newline) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000B ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (Newline) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000B ÷ 
0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000B ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (Newline) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000B ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (Newline) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000B ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (Newline) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000B ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (Newline) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 000B ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (Newline) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 000B ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (Newline) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 000B ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] 
(Newline) ÷ [3.1] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000B ÷ 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] (Newline) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0300 ÷ 000D ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0300 × 0308 ÷ 000D ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0300 ÷ 000A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0300 × 0308 ÷ 000A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0300 ÷ 000B ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0300 × 0308 ÷ 000B ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0300 × 0300 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0300 × 0308 × 0300 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0300 × 00AD ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0300 × 0308 × 00AD ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0300 ÷ 3031 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0300 × 0308 ÷ 3031 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0300 ÷ 24C2 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0300 × 0308 ÷ 24C2 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0300 ÷ 0041 
÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0300 × 0308 ÷ 0041 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0300 ÷ 003A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0300 × 0308 ÷ 003A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0300 ÷ 002C ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0300 × 0308 ÷ 002C ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0300 ÷ 002E ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0300 × 0308 ÷ 002E ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0300 ÷ 0030 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0300 × 0308 ÷ 0030 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0300 ÷ 005F ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0300 × 0308 ÷ 005F ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0300 ÷ 1F1E6 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0300 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0300 ÷ 05D0 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0300 × 0308 ÷ 05D0 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ 
[0.3] +÷ 0300 ÷ 0022 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0300 × 0308 ÷ 0022 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0300 ÷ 0027 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0300 × 0308 ÷ 0027 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0300 × 200D ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0300 × 0308 × 200D ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0300 ÷ 00A9 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0300 × 0308 ÷ 00A9 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0300 ÷ 0020 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0300 × 0308 ÷ 0020 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0300 ÷ 0000 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0300 × 0308 ÷ 0000 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0300 ÷ 0061 × 2060 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0300 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0300 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0300 × 0308 ÷ 0061 
÷ 003A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0300 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0300 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0300 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0300 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0300 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0300 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0300 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0300 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0300 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0300 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0300 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COMBINING GRAVE ACCENT 
(Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0300 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0300 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0300 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COMBINING GRAVE ACCENT (Extend) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00AD ÷ 000D ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [3.2] (CR) ÷ [0.3] +÷ 00AD × 0308 ÷ 000D ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 00AD ÷ 000A ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [3.2] (LF) ÷ [0.3] +÷ 00AD × 0308 ÷ 000A ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 00AD ÷ 000B ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [3.2] (Newline) ÷ [0.3] +÷ 00AD × 0308 ÷ 000B ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 00AD × 0300 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 00AD × 0308 × 0300 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 00AD × 00AD ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 00AD × 0308 × 00AD ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 00AD ÷ 3031 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 00AD × 0308 ÷ 3031 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 00AD ÷ 24C2 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ 
[0.3] +÷ 00AD × 0308 ÷ 24C2 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 00AD ÷ 0041 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 00AD × 0308 ÷ 0041 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 00AD ÷ 003A ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00AD × 0308 ÷ 003A ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00AD ÷ 002C ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00AD × 0308 ÷ 002C ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00AD ÷ 002E ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 00AD × 0308 ÷ 002E ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 00AD ÷ 0030 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 00AD × 0308 ÷ 0030 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 00AD ÷ 005F ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 00AD × 0308 ÷ 005F ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 00AD ÷ 1F1E6 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 00AD × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 00AD ÷ 05D0 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 00AD × 0308 ÷ 05D0 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) 
÷ [0.3] +÷ 00AD ÷ 0022 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 00AD × 0308 ÷ 0022 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 00AD ÷ 0027 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00AD × 0308 ÷ 0027 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00AD × 200D ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 00AD × 0308 × 200D ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 00AD ÷ 00A9 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 00AD × 0308 ÷ 00A9 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 00AD ÷ 0020 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 00AD × 0308 ÷ 0020 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 00AD ÷ 0000 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 00AD × 0308 ÷ 0000 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 00AD ÷ 0061 × 2060 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00AD × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00AD ÷ 0061 ÷ 003A ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00AD × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] 
+÷ 00AD ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00AD × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00AD ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00AD × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00AD ÷ 0061 ÷ 002C ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00AD × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00AD ÷ 0031 ÷ 003A ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00AD × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00AD ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00AD × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00AD ÷ 0031 ÷ 002C ÷ # ÷ [0.2] SOFT HYPHEN (Format) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00AD × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00AD ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] SOFT HYPHEN (Format) 
÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00AD × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] SOFT HYPHEN (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 3031 ÷ 000D ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [3.2] (CR) ÷ [0.3] +÷ 3031 × 0308 ÷ 000D ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 3031 ÷ 000A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [3.2] (LF) ÷ [0.3] +÷ 3031 × 0308 ÷ 000A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 3031 ÷ 000B ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [3.2] (Newline) ÷ [0.3] +÷ 3031 × 0308 ÷ 000B ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 3031 × 0300 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 3031 × 0308 × 0300 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 3031 × 00AD ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 3031 × 0308 × 00AD ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 3031 × 3031 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [13.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 3031 × 0308 × 3031 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) × [13.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 3031 ÷ 24C2 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 3031 × 0308 ÷ 24C2 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] 
CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 3031 ÷ 0041 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 3031 × 0308 ÷ 0041 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 3031 ÷ 003A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 3031 × 0308 ÷ 003A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 3031 ÷ 002C ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 3031 × 0308 ÷ 002C ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 3031 ÷ 002E ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 3031 × 0308 ÷ 002E ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 3031 ÷ 0030 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 3031 × 0308 ÷ 0030 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 3031 × 005F ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 3031 × 0308 × 005F ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 3031 ÷ 1F1E6 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 3031 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 3031 ÷ 05D0 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 3031 
× 0308 ÷ 05D0 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 3031 ÷ 0022 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 3031 × 0308 ÷ 0022 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 3031 ÷ 0027 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 3031 × 0308 ÷ 0027 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 3031 × 200D ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 3031 × 0308 × 200D ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 3031 ÷ 00A9 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 3031 × 0308 ÷ 00A9 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 3031 ÷ 0020 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 3031 × 0308 ÷ 0020 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 3031 ÷ 0000 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 3031 × 0308 ÷ 0000 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 3031 ÷ 0061 × 2060 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 3031 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A 
(ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 3031 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 3031 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 3031 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 3031 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 3031 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 3031 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 3031 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 3031 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 3031 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 3031 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 3031 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] DIGIT ONE (Numeric) ÷ 
[999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 3031 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 3031 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 3031 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 3031 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 3031 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 24C2 ÷ 000D ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [3.2] (CR) ÷ [0.3] +÷ 24C2 × 0308 ÷ 000D ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 24C2 ÷ 000A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [3.2] (LF) ÷ [0.3] +÷ 24C2 × 0308 ÷ 000A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 24C2 ÷ 000B ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [3.2] (Newline) ÷ [0.3] +÷ 24C2 × 0308 ÷ 000B ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 24C2 × 0300 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 24C2 × 0308 × 0300 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 24C2 × 00AD ÷ # ÷ [0.2] CIRCLED LATIN 
CAPITAL LETTER M (ALetter_ExtPict) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 24C2 × 0308 × 00AD ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 24C2 ÷ 3031 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 24C2 × 0308 ÷ 3031 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 24C2 × 24C2 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 24C2 × 0308 × 24C2 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 24C2 × 0041 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 24C2 × 0308 × 0041 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 24C2 ÷ 003A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 24C2 × 0308 ÷ 003A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 24C2 ÷ 002C ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 24C2 × 0308 ÷ 002C ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 24C2 ÷ 002E ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 24C2 × 0308 ÷ 002E ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 24C2 × 0030 ÷ # ÷ 
[0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 24C2 × 0308 × 0030 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 24C2 × 005F ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 24C2 × 0308 × 005F ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 24C2 ÷ 1F1E6 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 24C2 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 24C2 × 05D0 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 24C2 × 0308 × 05D0 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 24C2 ÷ 0022 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 24C2 × 0308 ÷ 0022 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 24C2 ÷ 0027 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 24C2 × 0308 ÷ 0027 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 24C2 × 200D ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 24C2 × 0308 × 200D ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 24C2 
÷ 00A9 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 24C2 × 0308 ÷ 00A9 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 24C2 ÷ 0020 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 24C2 × 0308 ÷ 0020 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 24C2 ÷ 0000 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 24C2 × 0308 ÷ 0000 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 24C2 × 0061 × 2060 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 24C2 × 0308 × 0061 × 2060 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 24C2 × 0061 ÷ 003A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 24C2 × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 24C2 × 0061 ÷ 0027 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 24C2 × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 24C2 × 0061 ÷ 0027 × 2060 ÷ # ÷ 
[0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 24C2 × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 24C2 × 0061 ÷ 002C ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 24C2 × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 24C2 × 0031 ÷ 003A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 24C2 × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 24C2 × 0031 ÷ 0027 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 24C2 × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 24C2 × 0031 ÷ 002C ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 24C2 × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 24C2 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER 
(Format) ÷ [0.3] +÷ 24C2 × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0041 ÷ 000D ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [3.2] (CR) ÷ [0.3] +÷ 0041 × 0308 ÷ 000D ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0041 ÷ 000A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [3.2] (LF) ÷ [0.3] +÷ 0041 × 0308 ÷ 000A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0041 ÷ 000B ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0041 × 0308 ÷ 000B ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0041 × 0300 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0041 × 0308 × 0300 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0041 × 00AD ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0041 × 0308 × 00AD ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0041 ÷ 3031 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0041 × 0308 ÷ 3031 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0041 × 24C2 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0041 × 0308 × 24C2 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] 
CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0041 × 0041 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0041 × 0308 × 0041 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0041 ÷ 003A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0041 × 0308 ÷ 003A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0041 ÷ 002C ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0041 × 0308 ÷ 002C ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0041 ÷ 002E ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0041 × 0308 ÷ 002E ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0041 × 0030 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0041 × 0308 × 0030 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0041 × 005F ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0041 × 0308 × 005F ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0041 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0041 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0041 × 05D0 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] 
HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0041 × 0308 × 05D0 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0041 ÷ 0022 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0041 × 0308 ÷ 0022 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0041 ÷ 0027 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0041 × 0308 ÷ 0027 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0041 × 200D ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0041 × 0308 × 200D ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0041 ÷ 00A9 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0041 × 0308 ÷ 00A9 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0041 ÷ 0020 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0041 × 0308 ÷ 0020 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0041 ÷ 0000 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0041 × 0308 ÷ 0000 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0041 × 0061 × 2060 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0041 × 0308 × 0061 × 2060 ÷ # ÷ [0.2] LATIN CAPITAL 
LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0041 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0041 × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0041 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0041 × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0041 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0041 × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0041 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0041 × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0041 × 0031 ÷ 003A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0041 × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) 
÷ [0.3] +÷ 0041 × 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0041 × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0041 × 0031 ÷ 002C ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0041 × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0041 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0041 × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 003A ÷ 000D ÷ # ÷ [0.2] COLON (MidLetter) ÷ [3.2] (CR) ÷ [0.3] +÷ 003A × 0308 ÷ 000D ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 003A ÷ 000A ÷ # ÷ [0.2] COLON (MidLetter) ÷ [3.2] (LF) ÷ [0.3] +÷ 003A × 0308 ÷ 000A ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 003A ÷ 000B ÷ # ÷ [0.2] COLON (MidLetter) ÷ [3.2] (Newline) ÷ [0.3] +÷ 003A × 0308 ÷ 000B ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 003A × 0300 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 003A × 0308 × 0300 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 003A × 00AD ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 003A × 0308 × 00AD ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS 
(Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 003A ÷ 3031 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 003A × 0308 ÷ 3031 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 003A ÷ 24C2 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 003A × 0308 ÷ 24C2 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 003A ÷ 0041 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 003A × 0308 ÷ 0041 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 003A ÷ 003A ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 003A × 0308 ÷ 003A ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 003A ÷ 002C ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 003A × 0308 ÷ 002C ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 003A ÷ 002E ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 003A × 0308 ÷ 002E ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 003A ÷ 0030 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 003A × 0308 ÷ 0030 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 003A ÷ 005F ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 003A × 0308 ÷ 005F ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 003A ÷ 1F1E6 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 003A × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] 
COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 003A ÷ 05D0 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 003A × 0308 ÷ 05D0 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 003A ÷ 0022 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 003A × 0308 ÷ 0022 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 003A ÷ 0027 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 003A × 0308 ÷ 0027 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 003A × 200D ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 003A × 0308 × 200D ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 003A ÷ 00A9 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 003A × 0308 ÷ 00A9 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 003A ÷ 0020 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 003A × 0308 ÷ 0020 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 003A ÷ 0000 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 003A × 0308 ÷ 0000 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 003A ÷ 0061 × 2060 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 003A × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 003A ÷ 0061 ÷ 003A 
÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 003A × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 003A ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 003A × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 003A ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 003A × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 003A ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 003A × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 003A ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 003A × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 003A ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 003A × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 003A ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] DIGIT ONE 
(Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 003A × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 003A ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 003A × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002C ÷ 000D ÷ # ÷ [0.2] COMMA (MidNum) ÷ [3.2] (CR) ÷ [0.3] +÷ 002C × 0308 ÷ 000D ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 002C ÷ 000A ÷ # ÷ [0.2] COMMA (MidNum) ÷ [3.2] (LF) ÷ [0.3] +÷ 002C × 0308 ÷ 000A ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 002C ÷ 000B ÷ # ÷ [0.2] COMMA (MidNum) ÷ [3.2] (Newline) ÷ [0.3] +÷ 002C × 0308 ÷ 000B ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 002C × 0300 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 002C × 0308 × 0300 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 002C × 00AD ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 002C × 0308 × 00AD ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 002C ÷ 3031 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 002C × 0308 ÷ 3031 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 002C ÷ 24C2 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 002C × 0308 ÷ 24C2 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) 
÷ [0.3] +÷ 002C ÷ 0041 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 002C × 0308 ÷ 0041 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 002C ÷ 003A ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002C × 0308 ÷ 003A ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002C ÷ 002C ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002C × 0308 ÷ 002C ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002C ÷ 002E ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 002C × 0308 ÷ 002E ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 002C ÷ 0030 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 002C × 0308 ÷ 0030 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 002C ÷ 005F ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 002C × 0308 ÷ 005F ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 002C ÷ 1F1E6 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 002C × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 002C ÷ 05D0 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 002C × 0308 ÷ 05D0 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 002C ÷ 0022 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 002C × 0308 ÷ 0022 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 002C ÷ 0027 ÷ # ÷ 
[0.2] COMMA (MidNum) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002C × 0308 ÷ 0027 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002C × 200D ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 002C × 0308 × 200D ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 002C ÷ 00A9 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 002C × 0308 ÷ 00A9 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 002C ÷ 0020 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 002C × 0308 ÷ 0020 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 002C ÷ 0000 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 002C × 0308 ÷ 0000 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 002C ÷ 0061 × 2060 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002C × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002C ÷ 0061 ÷ 003A ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002C × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002C ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002C × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002C ÷ 0061 ÷ 0027 
× 2060 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002C × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002C ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002C × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002C ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002C × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002C ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002C × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002C ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002C × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002C ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002C × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002E ÷ 000D ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [3.2] (CR) ÷ [0.3] +÷ 002E × 0308 ÷ 000D ÷ # ÷ [0.2] FULL STOP (MidNumLet) × 
[4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 002E ÷ 000A ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [3.2] (LF) ÷ [0.3] +÷ 002E × 0308 ÷ 000A ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 002E ÷ 000B ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [3.2] (Newline) ÷ [0.3] +÷ 002E × 0308 ÷ 000B ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 002E × 0300 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 002E × 0308 × 0300 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 002E × 00AD ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 002E × 0308 × 00AD ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 002E ÷ 3031 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 002E × 0308 ÷ 3031 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 002E ÷ 24C2 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 002E × 0308 ÷ 24C2 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 002E ÷ 0041 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 002E × 0308 ÷ 0041 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 002E ÷ 003A ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002E × 0308 ÷ 003A ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002E ÷ 002C ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002E × 0308 ÷ 002C ÷ # ÷ 
[0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002E ÷ 002E ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 002E × 0308 ÷ 002E ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 002E ÷ 0030 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 002E × 0308 ÷ 0030 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 002E ÷ 005F ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 002E × 0308 ÷ 005F ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 002E ÷ 1F1E6 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 002E × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 002E ÷ 05D0 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 002E × 0308 ÷ 05D0 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 002E ÷ 0022 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 002E × 0308 ÷ 0022 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 002E ÷ 0027 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002E × 0308 ÷ 0027 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002E × 200D ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 002E × 0308 × 200D ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 002E ÷ 00A9 ÷ # ÷ [0.2] FULL 
STOP (MidNumLet) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 002E × 0308 ÷ 00A9 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 002E ÷ 0020 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 002E × 0308 ÷ 0020 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 002E ÷ 0000 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 002E × 0308 ÷ 0000 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 002E ÷ 0061 × 2060 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002E × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002E ÷ 0061 ÷ 003A ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002E × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002E ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002E × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002E ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002E × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE 
(Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002E ÷ 0061 ÷ 002C ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002E × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002E ÷ 0031 ÷ 003A ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002E × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 002E ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002E × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 002E ÷ 0031 ÷ 002C ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002E × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 002E ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 002E × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] FULL STOP (MidNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0030 ÷ 000D ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [3.2] (CR) ÷ [0.3] +÷ 0030 × 0308 ÷ 000D ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0030 ÷ 000A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [3.2] (LF) ÷ [0.3] +÷ 0030 × 0308 ÷ 000A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0030 ÷ 000B ÷ # ÷ 
[0.2] DIGIT ZERO (Numeric) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0030 × 0308 ÷ 000B ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0030 × 0300 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0030 × 0308 × 0300 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0030 × 00AD ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0030 × 0308 × 00AD ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0030 ÷ 3031 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0030 × 0308 ÷ 3031 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0030 × 24C2 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0030 × 0308 × 24C2 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0030 × 0041 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0030 × 0308 × 0041 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0030 ÷ 003A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0030 × 0308 ÷ 003A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0030 ÷ 002C ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0030 × 0308 ÷ 002C ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0030 ÷ 002E ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0030 × 0308 ÷ 002E ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ 
[999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0030 × 0030 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [8.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0030 × 0308 × 0030 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [8.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0030 × 005F ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0030 × 0308 × 005F ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0030 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0030 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0030 × 05D0 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0030 × 0308 × 05D0 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0030 ÷ 0022 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0030 × 0308 ÷ 0022 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0030 ÷ 0027 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0030 × 0308 ÷ 0027 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0030 × 200D ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0030 × 0308 × 200D ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0030 ÷ 00A9 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0030 × 0308 ÷ 00A9 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0030 ÷ 0020 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0030 × 
0308 ÷ 0020 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0030 ÷ 0000 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0030 × 0308 ÷ 0000 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0030 × 0061 × 2060 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0030 × 0308 × 0061 × 2060 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0030 × 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0030 × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0030 × 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0030 × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0030 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0030 × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0030 × 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0030 × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [10.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ 
[999.0] COMMA (MidNum) ÷ [0.3] +÷ 0030 × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0030 × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0030 × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0030 × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0030 × 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0030 × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0030 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0030 × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [4.0] COMBINING DIAERESIS (Extend) × [8.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 005F ÷ 000D ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [3.2] (CR) ÷ [0.3] +÷ 005F × 0308 ÷ 000D ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 005F ÷ 000A ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [3.2] (LF) ÷ [0.3] +÷ 005F × 0308 ÷ 000A ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 005F ÷ 000B ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [3.2] (Newline) ÷ [0.3] +÷ 005F × 0308 ÷ 000B ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 005F × 0300 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 005F × 0308 × 0300 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING 
DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 005F × 00AD ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 005F × 0308 × 00AD ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 005F × 3031 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 005F × 0308 × 3031 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 005F × 24C2 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 005F × 0308 × 24C2 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 005F × 0041 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 005F × 0308 × 0041 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 005F ÷ 003A ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 005F × 0308 ÷ 003A ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 005F ÷ 002C ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 005F × 0308 ÷ 002C ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 005F ÷ 002E ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 005F × 0308 ÷ 002E ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 005F × 0030 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 005F × 0308 × 0030 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 005F × 005F ÷ # ÷ [0.2] LOW LINE 
(ExtendNumLet) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 005F × 0308 × 005F ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 005F ÷ 1F1E6 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 005F × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 005F × 05D0 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 005F × 0308 × 05D0 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 005F ÷ 0022 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 005F × 0308 ÷ 0022 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 005F ÷ 0027 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 005F × 0308 ÷ 0027 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 005F × 200D ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 005F × 0308 × 200D ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 005F ÷ 00A9 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 005F × 0308 ÷ 00A9 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 005F ÷ 0020 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 005F × 0308 ÷ 0020 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 005F ÷ 0000 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 005F × 0308 ÷ 0000 ÷ # ÷ [0.2] LOW LINE 
(ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 005F × 0061 × 2060 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 005F × 0308 × 0061 × 2060 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 005F × 0061 ÷ 003A ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 005F × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 005F × 0061 ÷ 0027 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 005F × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 005F × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 005F × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 005F × 0061 ÷ 002C ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 005F × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 005F × 0031 ÷ 003A ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 005F × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] LOW LINE 
(ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 005F × 0031 ÷ 0027 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 005F × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 005F × 0031 ÷ 002C ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 005F × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 005F × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 005F × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LOW LINE (ExtendNumLet) × [4.0] COMBINING DIAERESIS (Extend) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 1F1E6 ÷ 000D ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [3.2] (CR) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 000D ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 1F1E6 ÷ 000A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [3.2] (LF) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 000A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 1F1E6 ÷ 000B ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [3.2] (Newline) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 000B ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 1F1E6 × 0300 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 1F1E6 × 0308 × 0300 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING 
GRAVE ACCENT (Extend) ÷ [0.3] +÷ 1F1E6 × 00AD ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 1F1E6 × 0308 × 00AD ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 1F1E6 ÷ 3031 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 3031 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 1F1E6 ÷ 24C2 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 24C2 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 1F1E6 ÷ 0041 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0041 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 1F1E6 ÷ 003A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 003A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 1F1E6 ÷ 002C ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 002C ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 1F1E6 ÷ 002E ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 002E ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 1F1E6 ÷ 0030 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL 
LETTER A (RI) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0030 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 1F1E6 ÷ 005F ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 005F ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 1F1E6 × 1F1E6 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [15.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 1F1E6 × 0308 × 1F1E6 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) × [15.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 1F1E6 ÷ 05D0 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 05D0 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 1F1E6 ÷ 0022 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0022 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 1F1E6 ÷ 0027 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0027 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 1F1E6 × 200D ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 1F1E6 × 0308 × 200D ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 1F1E6 ÷ 00A9 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 00A9 ÷ # 
÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 1F1E6 ÷ 0020 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0020 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 1F1E6 ÷ 0000 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0000 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 1F1E6 ÷ 0061 × 2060 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 1F1E6 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 1F1E6 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 1F1E6 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] 
REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 1F1E6 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 1F1E6 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 1F1E6 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 1F1E6 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 1F1E6 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 1F1E6 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 05D0 ÷ 000D ÷ # ÷ [0.2] HEBREW LETTER ALEF 
(Hebrew_Letter) ÷ [3.2] (CR) ÷ [0.3] +÷ 05D0 × 0308 ÷ 000D ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 05D0 ÷ 000A ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [3.2] (LF) ÷ [0.3] +÷ 05D0 × 0308 ÷ 000A ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 05D0 ÷ 000B ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [3.2] (Newline) ÷ [0.3] +÷ 05D0 × 0308 ÷ 000B ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 05D0 × 0300 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 05D0 × 0308 × 0300 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 05D0 × 00AD ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 05D0 × 0308 × 00AD ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 05D0 ÷ 3031 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 05D0 × 0308 ÷ 3031 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 05D0 × 24C2 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [5.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 05D0 × 0308 × 24C2 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 05D0 × 0041 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 05D0 × 0308 × 0041 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 05D0 ÷ 003A ÷ # ÷ [0.2] HEBREW LETTER 
ALEF (Hebrew_Letter) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 05D0 × 0308 ÷ 003A ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 05D0 ÷ 002C ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 05D0 × 0308 ÷ 002C ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 05D0 ÷ 002E ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 05D0 × 0308 ÷ 002E ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 05D0 × 0030 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 05D0 × 0308 × 0030 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 05D0 × 005F ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 05D0 × 0308 × 005F ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 05D0 ÷ 1F1E6 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 05D0 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 05D0 × 05D0 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [5.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 05D0 × 0308 × 05D0 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 05D0 ÷ 0022 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 05D0 × 0308 ÷ 0022 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 
05D0 × 0027 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [7.1] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 05D0 × 0308 × 0027 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [7.1] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 05D0 × 200D ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 05D0 × 0308 × 200D ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 05D0 ÷ 00A9 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 05D0 × 0308 ÷ 00A9 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 05D0 ÷ 0020 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 05D0 × 0308 ÷ 0020 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 05D0 ÷ 0000 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 05D0 × 0308 ÷ 0000 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 05D0 × 0061 × 2060 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 05D0 × 0308 × 0061 × 2060 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 05D0 × 0061 ÷ 003A ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 05D0 × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 05D0 × 0061 ÷ 0027 ÷ # ÷ [0.2] HEBREW LETTER ALEF 
(Hebrew_Letter) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 05D0 × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 05D0 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 05D0 × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 05D0 × 0061 ÷ 002C ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 05D0 × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 05D0 × 0031 ÷ 003A ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 05D0 × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 05D0 × 0031 ÷ 0027 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 05D0 × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 05D0 × 0031 ÷ 002C ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 05D0 × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ 
[999.0] COMMA (MidNum) ÷ [0.3] +÷ 05D0 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 05D0 × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0022 ÷ 000D ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [3.2] (CR) ÷ [0.3] +÷ 0022 × 0308 ÷ 000D ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0022 ÷ 000A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [3.2] (LF) ÷ [0.3] +÷ 0022 × 0308 ÷ 000A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0022 ÷ 000B ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0022 × 0308 ÷ 000B ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0022 × 0300 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0022 × 0308 × 0300 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0022 × 00AD ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0022 × 0308 × 00AD ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0022 ÷ 3031 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0022 × 0308 ÷ 3031 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0022 ÷ 24C2 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0022 × 0308 ÷ 24C2 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING 
DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0022 ÷ 0041 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0022 × 0308 ÷ 0041 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0022 ÷ 003A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0022 × 0308 ÷ 003A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0022 ÷ 002C ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0022 × 0308 ÷ 002C ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0022 ÷ 002E ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0022 × 0308 ÷ 002E ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0022 ÷ 0030 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0022 × 0308 ÷ 0030 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0022 ÷ 005F ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0022 × 0308 ÷ 005F ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0022 ÷ 1F1E6 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0022 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0022 ÷ 05D0 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0022 × 0308 ÷ 05D0 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING 
DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0022 ÷ 0022 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0022 × 0308 ÷ 0022 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0022 ÷ 0027 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0022 × 0308 ÷ 0027 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0022 × 200D ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0022 × 0308 × 200D ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0022 ÷ 00A9 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0022 × 0308 ÷ 00A9 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0022 ÷ 0020 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0022 × 0308 ÷ 0020 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0022 ÷ 0000 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0022 × 0308 ÷ 0000 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0022 ÷ 0061 × 2060 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0022 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0022 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON 
(MidLetter) ÷ [0.3] +÷ 0022 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0022 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0022 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0022 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0022 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0022 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0022 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0022 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0022 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0022 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0022 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0022 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] 
QUOTATION MARK (Double_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0022 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0022 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0022 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] QUOTATION MARK (Double_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0027 ÷ 000D ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [3.2] (CR) ÷ [0.3] +÷ 0027 × 0308 ÷ 000D ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0027 ÷ 000A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [3.2] (LF) ÷ [0.3] +÷ 0027 × 0308 ÷ 000A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0027 ÷ 000B ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0027 × 0308 ÷ 000B ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0027 × 0300 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0027 × 0308 × 0300 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0027 × 00AD ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0027 × 0308 × 00AD ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0027 ÷ 3031 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0027 × 0308 ÷ 3031 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0027 ÷ 24C2 ÷ # ÷ [0.2] 
APOSTROPHE (Single_Quote) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0027 × 0308 ÷ 24C2 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0027 ÷ 0041 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0027 × 0308 ÷ 0041 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0027 ÷ 003A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0027 × 0308 ÷ 003A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0027 ÷ 002C ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0027 × 0308 ÷ 002C ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0027 ÷ 002E ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0027 × 0308 ÷ 002E ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0027 ÷ 0030 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0027 × 0308 ÷ 0030 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0027 ÷ 005F ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0027 × 0308 ÷ 005F ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0027 ÷ 1F1E6 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0027 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0027 ÷ 05D0 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] HEBREW LETTER 
ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0027 × 0308 ÷ 05D0 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0027 ÷ 0022 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0027 × 0308 ÷ 0022 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0027 ÷ 0027 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0027 × 0308 ÷ 0027 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0027 × 200D ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0027 × 0308 × 200D ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0027 ÷ 00A9 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0027 × 0308 ÷ 00A9 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0027 ÷ 0020 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0027 × 0308 ÷ 0020 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0027 ÷ 0000 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0027 × 0308 ÷ 0000 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0027 ÷ 0061 × 2060 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0027 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0027 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL 
LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0027 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0027 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0027 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0027 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0027 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0027 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0027 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0027 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0027 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0027 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0027 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0027 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] 
APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0027 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0027 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0027 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 200D ÷ 000D ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [3.2] (CR) ÷ [0.3] +÷ 200D × 0308 ÷ 000D ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 200D ÷ 000A ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [3.2] (LF) ÷ [0.3] +÷ 200D × 0308 ÷ 000A ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 200D ÷ 000B ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [3.2] (Newline) ÷ [0.3] +÷ 200D × 0308 ÷ 000B ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 200D × 0300 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 200D × 0308 × 0300 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 200D × 00AD ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 200D × 0308 × 00AD ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 200D ÷ 3031 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 200D × 0308 ÷ 3031 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 200D × 24C2 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [3.3] CIRCLED LATIN 
CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 200D × 0308 ÷ 24C2 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 200D ÷ 0041 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 200D × 0308 ÷ 0041 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 200D ÷ 003A ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 200D × 0308 ÷ 003A ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 200D ÷ 002C ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 200D × 0308 ÷ 002C ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 200D ÷ 002E ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 200D × 0308 ÷ 002E ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 200D ÷ 0030 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 200D × 0308 ÷ 0030 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 200D ÷ 005F ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 200D × 0308 ÷ 005F ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 200D ÷ 1F1E6 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 200D × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 200D ÷ 05D0 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 200D × 0308 ÷ 05D0 ÷ # ÷ [0.2] ZERO WIDTH JOINER 
(ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 200D ÷ 0022 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 200D × 0308 ÷ 0022 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 200D ÷ 0027 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 200D × 0308 ÷ 0027 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 200D × 200D ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 200D × 0308 × 200D ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 200D × 00A9 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [3.3] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 200D × 0308 ÷ 00A9 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 200D ÷ 0020 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 200D × 0308 ÷ 0020 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 200D ÷ 0000 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 200D × 0308 ÷ 0000 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 200D ÷ 0061 × 2060 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 200D × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 200D ÷ 0061 ÷ 003A ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 200D × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] ZERO WIDTH 
JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 200D ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 200D × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 200D ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 200D × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 200D ÷ 0061 ÷ 002C ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 200D × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 200D ÷ 0031 ÷ 003A ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 200D × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 200D ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 200D × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 200D ÷ 0031 ÷ 002C ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 200D × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] ZERO WIDTH 
JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 200D ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 200D × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00A9 ÷ 000D ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [3.2] (CR) ÷ [0.3] +÷ 00A9 × 0308 ÷ 000D ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 00A9 ÷ 000A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [3.2] (LF) ÷ [0.3] +÷ 00A9 × 0308 ÷ 000A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 00A9 ÷ 000B ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [3.2] (Newline) ÷ [0.3] +÷ 00A9 × 0308 ÷ 000B ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 00A9 × 0300 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 00A9 × 0308 × 0300 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 00A9 × 00AD ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 00A9 × 0308 × 00AD ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 00A9 ÷ 3031 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 00A9 × 0308 ÷ 3031 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 00A9 ÷ 24C2 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M 
(ALetter_ExtPict) ÷ [0.3] +÷ 00A9 × 0308 ÷ 24C2 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 00A9 ÷ 0041 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0041 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 00A9 ÷ 003A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00A9 × 0308 ÷ 003A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00A9 ÷ 002C ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00A9 × 0308 ÷ 002C ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00A9 ÷ 002E ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 00A9 × 0308 ÷ 002E ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 00A9 ÷ 0030 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0030 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 00A9 ÷ 005F ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 00A9 × 0308 ÷ 005F ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 00A9 ÷ 1F1E6 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 00A9 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 00A9 ÷ 05D0 ÷ # ÷ [0.2] COPYRIGHT 
SIGN (ExtPictmALetter) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 00A9 × 0308 ÷ 05D0 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 00A9 ÷ 0022 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0022 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 00A9 ÷ 0027 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0027 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00A9 × 200D ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 00A9 × 0308 × 200D ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 00A9 ÷ 00A9 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 00A9 × 0308 ÷ 00A9 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 00A9 ÷ 0020 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0020 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 00A9 ÷ 0000 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0000 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 00A9 ÷ 0061 × 2060 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN 
SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00A9 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00A9 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00A9 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00A9 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00A9 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 00A9 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE 
(Single_Quote) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 00A9 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 00A9 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 00A9 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] COPYRIGHT SIGN (ExtPictmALetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0020 ÷ 000D ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [3.2] (CR) ÷ [0.3] +÷ 0020 × 0308 ÷ 000D ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0020 ÷ 000A ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [3.2] (LF) ÷ [0.3] +÷ 0020 × 0308 ÷ 000A ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0020 ÷ 000B ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0020 × 0308 ÷ 000B ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0020 × 0300 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0020 × 0308 × 0300 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0020 × 00AD ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0020 × 0308 × 00AD ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0020 ÷ 3031 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0020 × 0308 ÷ 3031 
÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0020 ÷ 24C2 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0020 × 0308 ÷ 24C2 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0020 ÷ 0041 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0020 × 0308 ÷ 0041 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0020 ÷ 003A ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0020 × 0308 ÷ 003A ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0020 ÷ 002C ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0020 × 0308 ÷ 002C ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0020 ÷ 002E ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0020 × 0308 ÷ 002E ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0020 ÷ 0030 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0020 × 0308 ÷ 0030 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0020 ÷ 005F ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0020 × 0308 ÷ 005F ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0020 ÷ 1F1E6 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0020 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0020 ÷ 05D0 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] 
HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0020 × 0308 ÷ 05D0 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0020 ÷ 0022 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0020 × 0308 ÷ 0022 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0020 ÷ 0027 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0020 × 0308 ÷ 0027 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0020 × 200D ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0020 × 0308 × 200D ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0020 ÷ 00A9 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0020 × 0308 ÷ 00A9 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0020 × 0020 ÷ # ÷ [0.2] SPACE (WSegSpace) × [3.4] SPACE (WSegSpace) ÷ [0.3] +÷ 0020 × 0308 ÷ 0020 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0020 ÷ 0000 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0020 × 0308 ÷ 0000 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0020 ÷ 0061 × 2060 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0020 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0020 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0020 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] SPACE 
(WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0020 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0020 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0020 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0020 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0020 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0020 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0020 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0020 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0020 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0020 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0020 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0020 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE 
(Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0020 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] SPACE (WSegSpace) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0020 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0000 ÷ 000D ÷ # ÷ [0.2] (XXmExtPict) ÷ [3.2] (CR) ÷ [0.3] +÷ 0000 × 0308 ÷ 000D ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0000 ÷ 000A ÷ # ÷ [0.2] (XXmExtPict) ÷ [3.2] (LF) ÷ [0.3] +÷ 0000 × 0308 ÷ 000A ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0000 ÷ 000B ÷ # ÷ [0.2] (XXmExtPict) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0000 × 0308 ÷ 000B ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0000 × 0300 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0000 × 0308 × 0300 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0000 × 00AD ÷ # ÷ [0.2] (XXmExtPict) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0000 × 0308 × 00AD ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0000 ÷ 3031 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0000 × 0308 ÷ 3031 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0000 ÷ 24C2 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0000 × 0308 ÷ 24C2 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0000 ÷ 0041 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0000 × 0308 ÷ 0041 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS 
(Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0000 ÷ 003A ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0000 × 0308 ÷ 003A ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0000 ÷ 002C ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0000 × 0308 ÷ 002C ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0000 ÷ 002E ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0000 × 0308 ÷ 002E ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0000 ÷ 0030 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0000 × 0308 ÷ 0030 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0000 ÷ 005F ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0000 × 0308 ÷ 005F ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0000 ÷ 1F1E6 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0000 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0000 ÷ 05D0 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0000 × 0308 ÷ 05D0 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0000 ÷ 0022 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0000 × 0308 ÷ 0022 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0000 ÷ 0027 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0000 × 0308 ÷ 0027 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0000 × 200D ÷ # ÷ [0.2] 
(XXmExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0000 × 0308 × 200D ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0000 ÷ 00A9 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0000 × 0308 ÷ 00A9 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0000 ÷ 0020 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0000 × 0308 ÷ 0020 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0000 ÷ 0000 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0000 × 0308 ÷ 0000 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0000 ÷ 0061 × 2060 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0000 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0000 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0000 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0000 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0000 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0000 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0000 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS 
(Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0000 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0000 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0000 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0000 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0000 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0000 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0000 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0000 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0000 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] (XXmExtPict) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0000 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] (XXmExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 2060 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 × 2060 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ 
[3.2] (LF) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 × 2060 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 × 2060 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 × 2060 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 × 2060 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 × 2060 × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [5.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 2060 × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [5.0] 
LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 2060 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 2060 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 2060 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 × 2060 × 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 × 2060 × 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 × 2060 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] REGIONAL INDICATOR 
SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 × 2060 × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [5.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 × 2060 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 × 2060 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 2060 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 × 2060 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 × 2060 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A 
(ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 × 2060 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 × 2060 × 0308 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 × 2060 × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 2060 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 2060 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 2060 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × 
[5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 2060 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 2060 × 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 2060 × 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 2060 × 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] 
+÷ 0061 × 2060 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 2060 × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [9.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 ÷ 003A × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 003A × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ 
[0.3] +÷ 0061 ÷ 003A ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 × 003A × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 003A × 0308 × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 003A × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 003A × 0308 × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS 
(Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 × 003A × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 × 003A × 0308 × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON 
(MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 003A × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 × 003A × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 003A × 0308 × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 003A × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ 
[0.3] +÷ 0061 × 003A × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 003A × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 003A × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 003A × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 003A × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 003A × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 003A × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 
0061 ÷ 003A ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 003A × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [3.2] (Newline) ÷ 
[0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 ÷ 0027 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 0027 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 × 0027 × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [7.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 0027 × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [7.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN CAPITAL LETTER A 
(ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE 
(Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 × 0027 × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [7.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ 
[999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 × 0027 × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 0027 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 0027 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) 
× [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 0027 × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS 
(Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 0027 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE 
(Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [7.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [7.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE 
(Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN 
SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [7.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × 
[4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER 
(Format) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 × 0027 × 2060 × 0308 × 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [6.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [7.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 
0061 ÷ 0027 × 2060 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 0027 × 2060 × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) 
÷ [0.3] +÷ 0061 ÷ 002C ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 000D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 000A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 000B ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0061 ÷ 002C × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 × 0300 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0061 ÷ 002C × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 × 00AD ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 3031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] CIRCLED LATIN CAPITAL 
LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 24C2 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0041 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 002E ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0030 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] 
+÷ 0061 ÷ 002C × 0308 ÷ 005F ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 05D0 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0022 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 002C × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 × 200D ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COPYRIGHT SIGN 
(ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 00A9 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0020 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0000 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × 
[4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA 
(MidNum) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0061 ÷ 002C × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 003A × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 003A × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS 
(Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 
0031 ÷ 003A × 0308 ÷ 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 003A × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 
00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0061 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ 
[999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 
0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 003A × 0308 ÷ 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 0027 × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 0027 × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 
0027 × 0308 ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 × 0027 × 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [11.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0031 × 0027 × 0308 × 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [4.0] 
COMBINING DIAERESIS (Extend) × [11.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 0027 × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 
00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0061 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE 
(Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 0027 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 0027 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 0027 × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 × 0027 × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 × 0027 × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 × 0027 × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 × 0027 × 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 0027 × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] 
DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 0027 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 × 0027 × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] APOSTROPHE (Single_Quote) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 002C × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 002C × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ 
[999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 × 002C × 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [11.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0031 × 002C × 0308 × 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0031 ÷ 
002C ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002C × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] 
COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0061 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER 
(Format) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 002C × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 002C × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 × 002C × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 × 002C × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 × 002C × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 × 002C × 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 002C × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 002C × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 × 002C × 0308 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] COMMA (MidNum) × [4.0] 
COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 000D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (CR) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 000A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (LF) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 000B ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [3.2] (Newline) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 × 0300 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] COMBINING GRAVE ACCENT (Extend) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 × 00AD ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] SOFT HYPHEN (Format) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ 
[999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 3031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 24C2 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] CIRCLED LATIN CAPITAL LETTER M (ALetter_ExtPict) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0041 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 
× 0308 ÷ 002E ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] FULL STOP (MidNumLet) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [11.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0308 × 0030 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 005F ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 1F1E6 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 05D0 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0022 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] 
WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] QUOTATION MARK (Double_Quote) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 × 200D ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 00A9 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] COPYRIGHT SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0020 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0000 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] (XXmExtPict) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0061 × 2060 ÷ # ÷ [0.2] 
DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0061 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0061 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0061 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0061 ÷ 0027 × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] APOSTROPHE (Single_Quote) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT 
ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 ÷ 002E × 2060 × 0308 ÷ 0061 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0308 × 0031 ÷ 003A ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0308 × 0031 ÷ 0027 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0308 × 0031 ÷ 002C ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0031 ÷ 002E × 2060 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 0031 × 002E × 2060 × 0308 × 0031 ÷ 002E × 2060 ÷ # 
÷ [0.2] DIGIT ONE (Numeric) × [12.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) × [4.0] COMBINING DIAERESIS (Extend) × [11.0] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) × [4.0] WORD JOINER (Format) ÷ [0.3] +÷ 000D × 000A ÷ 0061 ÷ 000A ÷ 0308 ÷ # ÷ [0.2] (CR) × [3.0] (LF) ÷ [3.1] LATIN SMALL LETTER A (ALettermExtPict) ÷ [3.2] (LF) ÷ [3.1] COMBINING DIAERESIS (Extend) ÷ [0.3] +÷ 0061 × 0308 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) ÷ [0.3] +÷ 0020 × 200D ÷ 0646 ÷ # ÷ [0.2] SPACE (WSegSpace) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] ARABIC LETTER NOON (ALettermExtPict) ÷ [0.3] +÷ 0646 × 200D ÷ 0020 ÷ # ÷ [0.2] ARABIC LETTER NOON (ALettermExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] SPACE (WSegSpace) ÷ [0.3] +÷ 0671 × 0644 × 0631 × 064E × 0651 × 062D × 0650 × 064A × 0645 × 0650 ÷ 0020 ÷ 06DD × 0661 ÷ # ÷ [0.2] ARABIC LETTER ALEF WASLA (ALettermExtPict) × [5.0] ARABIC LETTER LAM (ALettermExtPict) × [5.0] ARABIC LETTER REH (ALettermExtPict) × [4.0] ARABIC FATHA (Extend) × [4.0] ARABIC SHADDA (Extend) × [5.0] ARABIC LETTER HAH (ALettermExtPict) × [4.0] ARABIC KASRA (Extend) × [5.0] ARABIC LETTER YEH (ALettermExtPict) × [5.0] ARABIC LETTER MEEM (ALettermExtPict) × [4.0] ARABIC KASRA (Extend) ÷ [999.0] SPACE (WSegSpace) ÷ [999.0] ARABIC END OF AYAH (Numeric) × [8.0] ARABIC-INDIC DIGIT ONE (Numeric) ÷ [0.3] +÷ 0721 × 0719 × 0721 × 0718 × 072A × 0710 ÷ 0020 ÷ 070F × 071D × 0717 ÷ # ÷ [0.2] SYRIAC LETTER MIM (ALettermExtPict) × [5.0] SYRIAC LETTER ZAIN (ALettermExtPict) × [5.0] SYRIAC LETTER MIM (ALettermExtPict) × [5.0] SYRIAC LETTER WAW (ALettermExtPict) × [5.0] SYRIAC LETTER RISH (ALettermExtPict) × [5.0] SYRIAC LETTER ALAPH (ALettermExtPict) ÷ [999.0] SPACE (WSegSpace) ÷ [999.0] SYRIAC ABBREVIATION MARK (ALettermExtPict) × [5.0] SYRIAC LETTER YUDH (ALettermExtPict) × [5.0] SYRIAC LETTER HE (ALettermExtPict) ÷ [0.3] +÷ 072C × 070F × 072B × 0712 × 0718 ÷ # ÷ [0.2] SYRIAC LETTER TAW (ALettermExtPict) × [5.0] 
SYRIAC ABBREVIATION MARK (ALettermExtPict) × [5.0] SYRIAC LETTER SHIN (ALettermExtPict) × [5.0] SYRIAC LETTER BETH (ALettermExtPict) × [5.0] SYRIAC LETTER WAW (ALettermExtPict) ÷ [0.3] +÷ 0041 × 0041 × 0041 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) × [5.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0041 × 003A × 0041 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [6.0] COLON (MidLetter) × [7.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0041 ÷ 003A ÷ 003A ÷ 0041 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 05D0 × 0027 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [7.1] APOSTROPHE (Single_Quote) ÷ [0.3] +÷ 05D0 × 0022 × 05D0 ÷ # ÷ [0.2] HEBREW LETTER ALEF (Hebrew_Letter) × [7.2] QUOTATION MARK (Double_Quote) × [7.3] HEBREW LETTER ALEF (Hebrew_Letter) ÷ [0.3] +÷ 0041 × 0030 × 0030 × 0041 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [9.0] DIGIT ZERO (Numeric) × [8.0] DIGIT ZERO (Numeric) × [10.0] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0030 × 002C × 0030 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) × [12.0] COMMA (MidNum) × [11.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 0030 ÷ 002C ÷ 002C ÷ 0030 ÷ # ÷ [0.2] DIGIT ZERO (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ZERO (Numeric) ÷ [0.3] +÷ 3031 × 3031 ÷ # ÷ [0.2] VERTICAL KANA REPEAT MARK (Katakana) × [13.0] VERTICAL KANA REPEAT MARK (Katakana) ÷ [0.3] +÷ 0041 × 005F × 0030 × 005F × 3031 × 005F ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ZERO (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] VERTICAL KANA REPEAT MARK (Katakana) × [13.1] LOW LINE (ExtendNumLet) ÷ [0.3] +÷ 0041 × 005F × 005F × 0041 ÷ # ÷ [0.2] LATIN CAPITAL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.1] LOW LINE (ExtendNumLet) × 
[13.2] LATIN CAPITAL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 1F1E6 × 1F1E7 ÷ 1F1E8 ÷ 0062 ÷ # ÷ [0.2] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [15.0] REGIONAL INDICATOR SYMBOL LETTER B (RI) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER C (RI) ÷ [999.0] LATIN SMALL LETTER B (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 1F1E6 × 1F1E7 ÷ 1F1E8 ÷ 0062 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [16.0] REGIONAL INDICATOR SYMBOL LETTER B (RI) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER C (RI) ÷ [999.0] LATIN SMALL LETTER B (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 1F1E6 × 1F1E7 × 200D ÷ 1F1E8 ÷ 0062 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [16.0] REGIONAL INDICATOR SYMBOL LETTER B (RI) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER C (RI) ÷ [999.0] LATIN SMALL LETTER B (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 1F1E6 × 200D × 1F1E7 ÷ 1F1E8 ÷ 0062 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [4.0] ZERO WIDTH JOINER (ZWJ) × [16.0] REGIONAL INDICATOR SYMBOL LETTER B (RI) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER C (RI) ÷ [999.0] LATIN SMALL LETTER B (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 1F1E6 × 1F1E7 ÷ 1F1E8 × 1F1E9 ÷ 0062 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER A (RI) × [16.0] REGIONAL INDICATOR SYMBOL LETTER B (RI) ÷ [999.0] REGIONAL INDICATOR SYMBOL LETTER C (RI) × [16.0] REGIONAL INDICATOR SYMBOL LETTER D (RI) ÷ [999.0] LATIN SMALL LETTER B (ALettermExtPict) ÷ [0.3] +÷ 1F476 × 1F3FF ÷ 1F476 ÷ # ÷ [0.2] BABY (ExtPictmALetter) × [4.0] EMOJI MODIFIER FITZPATRICK TYPE-6 (Extend) ÷ [999.0] BABY (ExtPictmALetter) ÷ [0.3] +÷ 1F6D1 × 200D × 1F6D1 ÷ # ÷ [0.2] OCTAGONAL SIGN (ExtPictmALetter) × [4.0] ZERO WIDTH JOINER (ZWJ) × [3.3] OCTAGONAL SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 × 200D × 1F6D1 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] 
ZERO WIDTH JOINER (ZWJ) × [3.3] OCTAGONAL SIGN (ExtPictmALetter) ÷ [0.3] +÷ 2701 × 200D ÷ 2701 ÷ # ÷ [0.2] UPPER BLADE SCISSORS (XXmExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] UPPER BLADE SCISSORS (XXmExtPict) ÷ [0.3] +÷ 0061 × 200D ÷ 2701 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] ZERO WIDTH JOINER (ZWJ) ÷ [999.0] UPPER BLADE SCISSORS (XXmExtPict) ÷ [0.3] +÷ 1F476 × 1F3FF × 0308 × 200D × 1F476 × 1F3FF ÷ # ÷ [0.2] BABY (ExtPictmALetter) × [4.0] EMOJI MODIFIER FITZPATRICK TYPE-6 (Extend) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) × [3.3] BABY (ExtPictmALetter) × [4.0] EMOJI MODIFIER FITZPATRICK TYPE-6 (Extend) ÷ [0.3] +÷ 1F6D1 × 1F3FF ÷ # ÷ [0.2] OCTAGONAL SIGN (ExtPictmALetter) × [4.0] EMOJI MODIFIER FITZPATRICK TYPE-6 (Extend) ÷ [0.3] +÷ 200D × 1F6D1 × 1F3FF ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [3.3] OCTAGONAL SIGN (ExtPictmALetter) × [4.0] EMOJI MODIFIER FITZPATRICK TYPE-6 (Extend) ÷ [0.3] +÷ 200D × 1F6D1 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [3.3] OCTAGONAL SIGN (ExtPictmALetter) ÷ [0.3] +÷ 200D × 1F6D1 ÷ # ÷ [0.2] ZERO WIDTH JOINER (ZWJ) × [3.3] OCTAGONAL SIGN (ExtPictmALetter) ÷ [0.3] +÷ 1F6D1 ÷ 1F6D1 ÷ # ÷ [0.2] OCTAGONAL SIGN (ExtPictmALetter) ÷ [999.0] OCTAGONAL SIGN (ExtPictmALetter) ÷ [0.3] +÷ 0061 × 0308 × 200D × 0308 × 0062 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [4.0] COMBINING DIAERESIS (Extend) × [4.0] ZERO WIDTH JOINER (ZWJ) × [4.0] COMBINING DIAERESIS (Extend) × [5.0] LATIN SMALL LETTER B (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 0020 × 0020 ÷ 0062 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] SPACE (WSegSpace) × [3.4] SPACE (WSegSpace) ÷ [999.0] LATIN SMALL LETTER B (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 003A ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE 
(Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 003A ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 003A ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 003A ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 003A ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 003A ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 003A ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE 
(Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 003A ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 003A ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 003A ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 003A ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 003A ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 003A ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002E ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002E ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE 
(Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002E ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 002E ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002E ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002E ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002E ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002E ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002E ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 002E ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002E ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW 
LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002E ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002E ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002E ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002E ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 002E ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002E ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002E ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002C ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE 
(Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002C ÷ 003A ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002C ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002C ÷ 003A ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002C ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002C ÷ 002E ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002C ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW 
LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002C ÷ 002E ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002C ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002C ÷ 002C ÷ 0031 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0031 ÷ 002C ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0031 ÷ 002C ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0031 × 005F × 0061 ÷ 002C ÷ 002C ÷ 0061 ÷ # ÷ [0.2] DIGIT ONE (Numeric) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 003A ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × 
[13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 003A ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 003A ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 003A ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 003A ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 003A ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] 
FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 003A ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 003A ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 003A ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 003A ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 003A ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 003A ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COLON (MidLetter) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 003A ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COLON (MidLetter) ÷ [999.0] 
COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002E ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002E ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002E ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 002E ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002E ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002E ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002E ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002E ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 
0061 ÷ 002E ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 002E ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002E ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002E ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002E ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002E ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002E ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 002E ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002E ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A 
(ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002E ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002C ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002C ÷ 003A ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002C ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002C ÷ 003A ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COLON (MidLetter) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] 
COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002C ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002C ÷ 002E ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002C ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002C ÷ 002E ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] FULL STOP (MidNumLet) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002C ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002C ÷ 002C ÷ 0031 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ 
[999.0] DIGIT ONE (Numeric) ÷ [0.3] +÷ 0061 ÷ 002C ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0031 ÷ 002C ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] DIGIT ONE (Numeric) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +÷ 0061 × 005F × 0061 ÷ 002C ÷ 002C ÷ 0061 ÷ # ÷ [0.2] LATIN SMALL LETTER A (ALettermExtPict) × [13.1] LOW LINE (ExtendNumLet) × [13.2] LATIN SMALL LETTER A (ALettermExtPict) ÷ [999.0] COMMA (MidNum) ÷ [999.0] COMMA (MidNum) ÷ [999.0] LATIN SMALL LETTER A (ALettermExtPict) ÷ [0.3] +# +# Lines: 1944 +# +# EOF diff --git a/opennlp-docs/src/docbkx/doccat.xml b/opennlp-docs/src/docbkx/doccat.xml index 7d03f1c2a..e12186ec4 100644 --- a/opennlp-docs/src/docbkx/doccat.xml +++ b/opennlp-docs/src/docbkx/doccat.xml @@ -171,6 +171,24 @@ String category = myCategorizer.getBestCategory(outcomes);]]> For additional examples, refer to the DocumentCategorizerDLEval class. + + Like NameFinderDL, long input is split into overlapping chunks on the full + Unicode White_Space set rather than Java's \s, so text copied + from PDFs, the web, or multilingual sources tokenizes consistently. Optional + preprocessing through InferenceOptions is off by default: + setNormalizeWhitespace(true) maps each Unicode whitespace code point to an + ASCII space, and setNormalizeDashes(true) maps Unicode dashes to the ASCII + hyphen-minus. Both are one-to-one replacements that preserve character offsets. See + for the shared CharClass engine and the + full normalization library. 
+ + + + diff --git a/opennlp-docs/src/docbkx/introduction.xml b/opennlp-docs/src/docbkx/introduction.xml index e7ac5c7c3..82e53cccb 100644 --- a/opennlp-docs/src/docbkx/introduction.xml +++ b/opennlp-docs/src/docbkx/introduction.xml @@ -303,7 +303,8 @@ Arguments description: and Document Categorizer. This allows models trained by other frameworks such as PyTorch and Tensorflow to be used by OpenNLP. The documentation for each of the OpenNLP components that supports ONNX models describes how to - use ONNX models for inference. + use ONNX models for inference. DL inference uses Unicode-aware text chunking and + optional input normalization; see . diff --git a/opennlp-docs/src/docbkx/namefinder.xml b/opennlp-docs/src/docbkx/namefinder.xml index ff695d898..6c2c759c0 100644 --- a/opennlp-docs/src/docbkx/namefinder.xml +++ b/opennlp-docs/src/docbkx/namefinder.xml @@ -157,11 +157,36 @@ Span[] nameSpans = nameFinder.find(sentence);]]> File vocab = new File("/path/to/vocab.txt"); Map categories = new HashMap<>(); String[] tokens = new String[]{"George", "Washington", "was", "president", "of", "the", "United", "States", "."}; -NameFinderDL nameFinderDL = new NameFinderDL(model, vocab, false, getIds2Labels()); +NameFinderDL nameFinderDL = new NameFinderDL(model, vocab, getIds2Labels(), sentenceDetector); Span[] spans = nameFinderDL.find(tokens);]]> For additional examples, refer to the NameFinderDLEval class. + + Long input text is split into overlapping chunks on the full Unicode + White_Space set before WordPiece tokenization, so spacing such as a + no-break space or the CJK ideographic space is recognized as a delimiter. After + inference, reconstructed entity text is matched back to the caller's original input + with a Unicode-aware cursor scan (not a regular expression), so + Span#getCoveredText(...) returns the source text even when WordPiece + rejoins sub-tokens with spaces or when the source uses non-ASCII whitespace between + tokens. 
+ + + Optional preprocessing of the joined input text is available through + InferenceOptions and is off by default: + setNormalizeWhitespace(true) folds each Unicode whitespace character to + an ASCII space, and setNormalizeDashes(true) folds Unicode dashes to the + ASCII hyphen-minus. Both transforms are one code point to one character and preserve + offsets. Full details, the underlying CharClass engine, and the broader + normalization pipeline are documented in . + + + + diff --git a/opennlp-docs/src/docbkx/normalizer.xml b/opennlp-docs/src/docbkx/normalizer.xml index d14177db1..173a34c76 100644 --- a/opennlp-docs/src/docbkx/normalizer.xml +++ b/opennlp-docs/src/docbkx/normalizer.xml @@ -60,9 +60,15 @@ - There are two layers. The CharSequenceNormalizer family offers ready-made, - composable normalizers; the CharClass engine is the low-level, configurable - building block they are made of. + Two engines underpin everything: the CharSequenceNormalizer family offers + ready-made, composable normalizers, and the CharClass engine is the low-level, + configurable building block they are made of. Built on these are three higher-level + features documented below: a layered term model that projects a token through a + configurable stack of transforms while keeping every intermediate form (see + ), per-language profiles that select the transforms + appropriate to a language (see ), and confusable + folding that reduces lookalike characters for matching (see + ). @@ -141,6 +147,16 @@ AccentFoldCharSequenceNormalizer Folds diacritics in a script-aware way (see below). + + GermanUmlautCharSequenceNormalizer + Transliterates German umlauts and the eszett (a-umlaut to ae, + eszett to ss; DIN 5007-2). + + + ConfusableSkeletonCharSequenceNormalizer + Reduces lookalike characters to a confusable skeleton for matching + (UTS #39); see below. 
+ @@ -184,9 +200,72 @@ String t = search.normalize("“CafÉ”").toString(); // "\"cafe\""]]> Any custom CharSequenceNormalizer can be inserted with - with(...). None of these normalizers is applied automatically by any OpenNLP - component; normalization is always an explicit, opt-in choice. + with(...). The TextNormalizer pipeline and the individual + CharSequenceNormalizer implementations are not applied automatically by + statistical OpenNLP components; callers compose them explicitly when preprocessing text + for search or matching. The DL components described in the next section use a narrower, + built-in subset of this machinery. + + + +
+ Use in DL components + + NameFinderDL and DocumentCategorizerDL share Unicode-aware text + handling through AbstractDL. Long inputs are split into overlapping chunks + on the full Unicode White_Space set (no-break space, ideographic space, line + and paragraph separators, and the other members listed under + ), not on Java's six-character + \s subset. Empty tokens from leading, trailing, or repeated whitespace are + not produced. + + + NameFinderDL additionally locates reconstructed entity text in the original + input with a cursor-based matcher: a space in the reconstructed span matches zero or more + Unicode whitespace code points in the source, and every other code point is compared + case-insensitively. This replaces the previous regular-expression approach and correctly + handles spacing copied from PDFs, the web, or non-Latin sources when resolving + Span#getCoveredText(...). + + + Optional input folding is controlled through InferenceOptions and is + off by default so existing models keep their prior inputs unless + you opt in: + + + + + setNormalizeWhitespace(true) maps each Unicode whitespace code point + to a single ASCII space before inference. The transform is one code point to one + space, so character offsets stay aligned with the input. + + + + + setNormalizeDashes(true) maps each dash in the default + CharClass.dashes() set to the ASCII hyphen-minus. Mathematical minus + signs and the soft hyphen are not affected unless you extend the set explicitly. + This replacement is also one code point to one character for Basic Multilingual + Plane dashes. + + + + + Run-collapsing normalization (for example WhitespaceCharSequenceNormalizer, + which collapses whitespace runs to a single space) is not enabled + through these flags because it would shift character offsets. Use the + CharSequenceNormalizer pipeline directly when you need that behavior on text + that does not require offset-preserving span lookup. See also + and + . 
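+ + InferenceOptions options = new InferenceOptions(); +options.setNormalizeWhitespace(true); // opt-in: Unicode whitespace -> ASCII space +options.setNormalizeDashes(true); // opt-in: en dash, em dash, ... -> hyphen-minus + +NameFinderDL finder = new NameFinderDL(model, vocab, ids2Labels, options, sentenceDetector);]]> +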
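The cursor-based matching described in this section can be illustrated with a minimal, JDK-only sketch. It is not the NameFinderDL implementation: the class name CursorMatchSketch, its method names, and the whitespace predicate (an approximation of the Unicode White_Space set built from JDK character tests) are assumptions made for illustration only.

```java
class CursorMatchSketch {

  /** Approximates Unicode White_Space with JDK predicates (assumption, not OpenNLP's set). */
  static boolean isUnicodeSpace(int cp) {
    return Character.isWhitespace(cp) || Character.isSpaceChar(cp) || cp == 0x85;
  }

  /** Compares two code points case-insensitively. */
  static boolean sameIgnoringCase(int a, int b) {
    return a == b
        || Character.toLowerCase(Character.toUpperCase(a))
            == Character.toLowerCase(Character.toUpperCase(b));
  }

  /** Returns the exclusive end offset of a match that starts at start, or -1. */
  static int matchAt(String source, int start, String needle) {
    int i = start;
    int j = 0;
    while (j < needle.length()) {
      int n = needle.codePointAt(j);
      j += Character.charCount(n);
      if (n == ' ') {
        // A space in the reconstructed text absorbs zero or more source whitespace code points.
        while (i < source.length() && isUnicodeSpace(source.codePointAt(i))) {
          i += Character.charCount(source.codePointAt(i));
        }
        continue;
      }
      if (i >= source.length()) {
        return -1;
      }
      int s = source.codePointAt(i);
      if (!sameIgnoringCase(s, n)) {
        return -1;
      }
      i += Character.charCount(s);
    }
    return i;
  }

  /** Scans forward from the cursor and returns {start, end} of the first match, or null. */
  static int[] find(String source, String needle, int from) {
    for (int start = from; start < source.length(); start++) {
      int end = matchAt(source, start, needle);
      if (end > start) {
        return new int[] {start, end};
      }
    }
    return null;
  }

  public static void main(String[] args) {
    int[] m = find("Visit AT&T in the U.S.A today", "at & t", 0);
    System.out.println(m[0] + ".." + m[1]);
  }
}
```

Passing the end offset of one match as the next call's cursor is what lets a repeated surface form resolve to its own occurrence rather than to an earlier one.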
@@ -232,8 +311,9 @@ CharSequenceNormalizer latinOnly = new AccentFoldCharSequenceNormalizer( The set-based normalizers are built on CharClass, a configurable class of Unicode code points paired with a single canonical replacement, backed by a - CodePointSet with O(1) membership. Whitespace and dashes are the two built-in - presets, and any other class is one more configured instance: + CodePointSet with O(1) membership. You choose both the membership and the + replacement code point with CharClass.of(members, replacement); whitespace and + dashes are the two built-in presets, and any other class is one more configured instance: U+0020 @@ -244,6 +324,48 @@ ws.trim(" hi "); // "hi" String[] tokens = ws.split("one two"); // ["one", "two"] (offset-aware via splitSpans) dash.normalize("a—b"); // "a-b"]]> + + A class applies its replacement three ways, which differ in whether they collapse runs and + whether they preserve character offsets: + + + + + normalize(text) replaces each member one-for-one with the replacement, + so it is length- and offset-preserving; use it when you still need spans back into + the original text. + + + + + collapse(text) reduces each maximal run of members to a single + replacement; it changes length, so it is a search and match transform. + + + + + collapsePreserving(text, keep, keepReplacement) collapses runs but emits + keepReplacement for any run containing a kept code point, which is how + you squish horizontal whitespace while keeping line breaks. + + + + + So the replacement is your choice and the method picks the behavior. Folding tabs and + newlines to a single newline, for example, is one configured class: + + + + + + When you need the normalized form together with a map back to the original, the + normalizeMapped and collapseMapped variants return a + NormalizedText that carries the offset map. 
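The difference between one-for-one replacement and run collapsing can be sketched with plain JDK code. This is an illustration of the two behaviors only, not the CharClass implementation: the class CharClassSketch and its method signatures are hypothetical, and membership is modeled with an ordinary Set rather than a CodePointSet.

```java
import java.util.Set;

class CharClassSketch {

  /** Replaces each member one-for-one, so length and offsets are preserved for BMP members. */
  static String normalize(String text, Set<Integer> members, int replacement) {
    StringBuilder out = new StringBuilder(text.length());
    text.codePoints()
        .forEach(cp -> out.appendCodePoint(members.contains(cp) ? replacement : cp));
    return out.toString();
  }

  /** Reduces each maximal run of members to a single replacement, so length can change. */
  static String collapse(String text, Set<Integer> members, int replacement) {
    StringBuilder out = new StringBuilder(text.length());
    boolean inRun = false;
    for (int i = 0; i < text.length(); ) {
      int cp = text.codePointAt(i);
      i += Character.charCount(cp);
      if (members.contains(cp)) {
        if (!inRun) {
          out.appendCodePoint(replacement);
        }
        inRun = true;
      } else {
        out.appendCodePoint(cp);
        inRun = false;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    Set<Integer> ws = Set.of((int) ' ', (int) '\t', 0xA0);
    String input = "a \t\u00A0b";
    System.out.println(normalize(input, ws, ' ').length() == input.length());
    System.out.println(collapse(input, ws, ' '));
  }
}
```

The one-for-one form is the one to reach for when spans must still index into the original text; the collapsed form suits search and matching, where offsets no longer matter.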
+ A CodePointSet can be built explicitly, as a range, by union, or loaded from a user definitions file so that delimiters can be extended without a code change. The @@ -287,6 +409,118 @@ List terms = analyzer.terms("Café au lait"); // ["cafe", "au", "lait"
+
+ The layered term model + + Where TextAnalyzer gives each token an original and a single normalized form, + TermAnalyzer gives each token a stack of normalization + layers. A Term is one token projected through an ordered chain of + Dimensions: original, NFC, NFKC, whitespace, dash, case fold, accent fold, + confusable fold, stem, and lemma. The order is fixed because the transforms do not commute + (case folding then accent folding differs from the reverse). The original is always kept, + so aggressive folding stays safe and a match on any layer maps back to the source through + the token's Span. + + + "Running" +// term.normalized() -> "run" (the final configured dimension, here STEM) +// term.peel() -> "running" (the layer below the top, O(1)) +// term.at(Dimension.NFC) -> computed lazily on first request, then cached]]> + + + Segmentation uses the word tokenizer, so the input + does not need to be pre-tokenized. The dimensions named in the builder are computed eagerly; + any other dimension is computed on first request, applied on top of the final form, and + cached, so querying a configured layer or peeling the last one is O(1) and adding an + unrequested dimension costs one transform. The character-level dimensions have built-in + defaults; STEM and LEMMA require a + Stemmer or Lemmatizer (and LEMMA a part-of-speech + tag), and fail loudly if requested without them. An analyzer configured with a stemmer is + not thread-safe, because the Snowball stemmers are stateful. + + + Each dimension's transform is configurable on the builder. 
Beyond the no-argument methods + that enable a dimension with its default, there are convenience methods for the common + knobs, and a general transform(dimension, normalizer) escape hatch for any + character-level dimension: + + + + + + The whitespace and dash methods take any CharSequenceNormalizer, so a + CharClass method reference (::normalize for one-for-one, + ::collapse for run-collapsing) selects both the fold target and the behavior. + The case-fold method takes a Locale for language-specific rules such as the + Turkish dotted/dotless i, and the accent-fold method takes the scripts to fold and whether + to fold stroke letters. + +
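The idea of keeping every intermediate form can be illustrated with a small JDK-only sketch. It is not the TermAnalyzer API: the class LayeredTermSketch, its Layer enum, and the project method are hypothetical names, only three layers are modeled, and the accent fold here is a plain NFD-and-strip-marks approximation rather than the script-aware fold.

```java
import java.text.Normalizer;
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

class LayeredTermSketch {

  /** A reduced, illustrative set of layers in a fixed order. */
  enum Layer { ORIGINAL, CASE_FOLD, ACCENT_FOLD }

  /** Projects a token through the layers in order, retaining every intermediate form. */
  static Map<Layer, String> project(String token) {
    Map<Layer, String> layers = new EnumMap<>(Layer.class);
    layers.put(Layer.ORIGINAL, token);

    // Each layer is applied on top of the previous one.
    String caseFolded = token.toLowerCase(Locale.ROOT);
    layers.put(Layer.CASE_FOLD, caseFolded);

    // Approximate accent fold: decompose, then drop combining marks.
    String accentFolded = Normalizer.normalize(caseFolded, Normalizer.Form.NFD)
        .replaceAll("\\p{M}+", "");
    layers.put(Layer.ACCENT_FOLD, accentFolded);
    return layers;
  }

  public static void main(String[] args) {
    System.out.println(project("Caf\u00E9"));
  }
}
```

Because the original form is always retained, a match on the most aggressive layer can still be reported against the source text.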
+ +
+ Confusable (homoglyph) folding + + Confusables reduces text to its Unicode confusable + skeleton following + UTS #39: it decomposes the + text, replaces each code point with its prototype, and decomposes again. Two strings are + confusable exactly when their skeletons are equal, which catches spoofing where Cyrillic or + Greek letters imitate Latin ones. + + + + + + The skeleton changes length and offsets, so like accent folding it is a derived, + matching-only form. It is also available as + ConfusableSkeletonCharSequenceNormalizer and as the + CONFUSABLE_FOLD term dimension. The mapping comes from the bundled Unicode + security data file confusables.txt. + +
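The decompose, map, decompose sequence can be sketched with the JDK's Normalizer and a tiny prototype table. This is an illustration under stated assumptions, not the Confusables implementation: the class SkeletonSketch is hypothetical, and its five-entry map is a hand-picked subset, whereas the real mapping comes from confusables.txt and is far larger.

```java
import java.text.Normalizer;
import java.util.Map;

class SkeletonSketch {

  // Hand-picked illustrative subset; NOT the full UTS #39 data.
  private static final Map<Integer, String> PROTOTYPES = Map.of(
      0x0430, "a",   // CYRILLIC SMALL LETTER A
      0x0435, "e",   // CYRILLIC SMALL LETTER IE
      0x043E, "o",   // CYRILLIC SMALL LETTER O
      0x0440, "p",   // CYRILLIC SMALL LETTER ER
      0x03BF, "o");  // GREEK SMALL LETTER OMICRON

  /** Decomposes, maps each code point to its prototype, then decomposes again. */
  static String skeleton(String text) {
    String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
    StringBuilder mapped = new StringBuilder(decomposed.length());
    decomposed.codePoints().forEach(cp -> {
      String prototype = PROTOTYPES.get(cp);
      if (prototype != null) {
        mapped.append(prototype);
      } else {
        mapped.appendCodePoint(cp);
      }
    });
    return Normalizer.normalize(mapped, Normalizer.Form.NFD);
  }

  /** Two strings are treated as confusable when their skeletons are equal. */
  static boolean confusable(String a, String b) {
    return skeleton(a).equals(skeleton(b));
  }

  public static void main(String[] args) {
    // The second letter of the first argument is Cyrillic U+0430, not Latin "a".
    System.out.println(confusable("p\u0430ypal", "paypal"));
  }
}
```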
+ +
+ Per-language profiles + + NormalizationProfiles selects per-language settings the same way OpenNLP + already selects a Snowball stemmer by language: ask for a language, or detect it with a + LanguageDetector when it is unspecified. Each + NormalizationProfile pairs a language with its Snowball stemmer and the + diacritic fold appropriate for that language, and builds a search-oriented + TermAnalyzer. + + + + + + The diacritic fold is the generic accent fold for English and the major Romance languages, + the German-specific fold (a-umlaut to ae, eszett to ss, following + DIN 5007-2) for German, and none for the Nordic languages and non-Latin scripts, where + folding distinct letters is language-wrong. As stated in + , this is a search-recall choice, not + linguistic correctness; a caller that wants different behavior builds a + TermAnalyzer directly. + +
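A German-specific fold of the kind described here might look like the following JDK-only sketch. It is not GermanUmlautCharSequenceNormalizer: the class GermanFoldSketch is hypothetical, and the handling of o-umlaut, u-umlaut, and the uppercase letters follows the common transliteration convention as an assumption, since the text above only spells out a-umlaut and eszett.

```java
class GermanFoldSketch {

  /** Transliterates umlauts and eszett; every other character is copied unchanged. */
  static String fold(String text) {
    StringBuilder out = new StringBuilder(text.length() + 4);
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      switch (c) {
        case '\u00E4': out.append("ae"); break; // a-umlaut
        case '\u00F6': out.append("oe"); break; // o-umlaut (assumed mapping)
        case '\u00FC': out.append("ue"); break; // u-umlaut (assumed mapping)
        case '\u00C4': out.append("Ae"); break; // capital a-umlaut (assumed mapping)
        case '\u00D6': out.append("Oe"); break; // capital o-umlaut (assumed mapping)
        case '\u00DC': out.append("Ue"); break; // capital u-umlaut (assumed mapping)
        case '\u00DF': out.append("ss"); break; // eszett
        default: out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(fold("M\u00FCller Stra\u00DFe"));
  }
}
```

The output is longer than the input, so like the other folds in this chapter it is a derived form for matching rather than an offset-preserving transform.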
+
Reference data diff --git a/opennlp-docs/src/docbkx/tokenizer.xml b/opennlp-docs/src/docbkx/tokenizer.xml index b6fb7b074..7bb3356de 100644 --- a/opennlp-docs/src/docbkx/tokenizer.xml +++ b/opennlp-docs/src/docbkx/tokenizer.xml @@ -23,7 +23,16 @@ The OpenNLP Tokenizers segment an input character sequence into tokens. Tokens are usually words, punctuation, numbers, etc. - + + + The statistical tokenizers in this chapter assume conventional whitespace-separated training + and test data. When input contains Unicode spacing or dash variants (no-break space, + ideographic space, en dash, and similar characters from PDFs or the web), use the + Unicode-aware preprocessing described in . The DL + components apply that machinery automatically for document chunking; see + . + +
+ +
+ Unicode Word Segmentation (UAX #29) + + The package opennlp.tools.tokenize.uax29 provides a tokenizer that follows the + Unicode Text Segmentation algorithm + (UAX #29), word boundary + rules WB1 through WB999. It is rule based and needs no trained model, it works directly over + a CharSequence, and it reports character offsets so the original text is + preserved for downstream processing such as the normalization described in + . The boundary data comes from the bundled Unicode + Character Database (currently Unicode 17.0) and the implementation passes the official + WordBreakTest conformance suite for that release. + +
+ Word Segmenter + + WordSegmenter finds the word boundaries. It is a single forward cursor pass + with constant-time property look-ups and no regular expression. Every segment is + reported, including whitespace and punctuation runs, so the segments are contiguous and + together cover the whole text. + + segments = WordSegmenter.segments("The quick brown fox.");]]> + + For allocation-free processing of large inputs, stream the segments to a callback instead + of collecting them. + + { + // handle the segment [start, end) +});]]> + + +
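The contiguity property (every segment reported, segments covering the whole text) can be tried out with the JDK's own word-boundary iterator. The sketch below uses java.text.BreakIterator, not the WordSegmenter from this patch, so its boundaries may differ from the bundled Unicode data in edge cases; the class name ContiguousSegmentsSketch is hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ContiguousSegmentsSketch {

  /** Returns every segment between consecutive word boundaries, including spaces. */
  static List<String> segments(String text) {
    BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
    words.setText(text);
    List<String> out = new ArrayList<>();
    int start = words.first();
    for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
      out.add(text.substring(start, end));
    }
    return out;
  }

  public static void main(String[] args) {
    String text = "The quick brown fox.";
    List<String> segments = segments(text);
    System.out.println(segments);
    // Contiguous segments concatenate back to the original text.
    System.out.println(String.join("", segments).equals(text));
  }
}
```

Because nothing is dropped at this stage, a later step can decide which segments count as words without losing any character offsets.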
+
+ Word Tokenizer + + WordTokenizer builds on the segmenter. It keeps the segments that are words + (letters, digits, ideographs, kana, Hangul, a Southeast Asian script, or emoji), drops + whitespace and punctuation, and classifies each token. It implements the standard + Tokenizer interface, so it can be used wherever a tokenizer is expected. + + + + The tokens array contains "The", "quick", "brown", and "fox"; the trailing period and the + spaces are dropped. The tokenizeTyped method additionally returns the + category of each token as a WordType. + + + + The categories are ALPHANUMERIC, NUMERIC, + IDEOGRAPHIC, HIRAGANA, KATAKANA, + HANGUL, SOUTHEAST_ASIAN, and EMOJI. + + + A streaming overload reports each token to a handler with no per-token allocation, which + is the fastest option when the tokens are consumed on the fly. + + { + // handle the token [start, end) of the given WordType +});]]> + + A token longer than the maximum token length is emitted as consecutive pieces without + splitting a surrogate pair. The maximum defaults to + WordTokenizer.DEFAULT_MAX_TOKEN_LENGTH and can be set through the + constructor. + + + + +
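Emitting an over-long token as consecutive pieces without splitting a surrogate pair can be sketched as follows. This is one possible approach under stated assumptions, not the WordTokenizer code: the class name MaxLengthPiecesSketch and the pieces helper are hypothetical, and the length is assumed to be measured in UTF-16 chars.

```java
import java.util.ArrayList;
import java.util.List;

public class MaxLengthPiecesSketch {

  /** Cuts token into pieces of at most maxLength chars, never between surrogates. */
  static List<String> pieces(String token, int maxLength) {
    if (maxLength < 2) {
      throw new IllegalArgumentException("maxLength must be at least 2");
    }
    List<String> out = new ArrayList<>();
    int start = 0;
    while (start < token.length()) {
      int end = Math.min(start + maxLength, token.length());
      // Back off one char if the cut would fall between a high and a low surrogate.
      if (end < token.length()
          && Character.isHighSurrogate(token.charAt(end - 1))
          && Character.isLowSurrogate(token.charAt(end))) {
        end--;
      }
      out.add(token.substring(start, end));
      start = end;
    }
    return out;
  }

  public static void main(String[] args) {
    // U+1F600 occupies two chars; a cut after three chars would split it.
    String token = "ab\uD83D\uDE00c";
    System.out.println(pieces(token, 3).size());
  }
}
```

Requiring a maximum of at least two chars guarantees that backing off by one still makes progress on every iteration.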
+
From 2d980fba9844e581bf72853d1f560454b750fe8e Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Fri, 19 Jun 2026 16:48:38 -0400 Subject: [PATCH 09/11] OPENNLP-1850 Move normalizer engine out of opennlp-api; unify token analysis on Term - Relocate CharClass, CodePointSet, UnicodeWhitespace, UnicodeDash (and their tests) from opennlp-api to opennlp-runtime, so the API keeps only contracts (CharSequenceNormalizer) and table-free value types (NormalizedText, OffsetMap). - opennlp-dl now depends on opennlp-runtime (compile) for the CharClass it uses to chunk input (AbstractDL). - Delete TextAnalyzer/AnalyzedToken: TermAnalyzer/Term is the single token-analysis entry point; original/normalized/span are read from Term. Manual updated. --- .../tools/util/normalizer/AnalyzedToken.java | 34 ------ .../tools/util/normalizer/TextAnalyzer.java | 93 ---------------- .../util/normalizer/TextAnalyzerTest.java | 102 ------------------ opennlp-core/opennlp-ml/opennlp-dl/pom.xml | 12 +-- .../tools/util/normalizer/CharClass.java | 0 .../tools/util/normalizer/CodePointSet.java | 0 .../tools/util/normalizer/UnicodeDash.java | 0 .../util/normalizer/UnicodeWhitespace.java | 0 .../tools/util/normalizer/CharClassTest.java | 0 .../util/normalizer/CodePointSetTest.java | 0 .../util/normalizer/UnicodeDashTest.java | 0 .../normalizer/UnicodeWhitespaceTest.java | 0 opennlp-docs/src/docbkx/normalizer.xml | 31 ++---- 13 files changed, 11 insertions(+), 261 deletions(-) delete mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java delete mode 100644 opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java delete mode 100644 opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java rename {opennlp-api => opennlp-core/opennlp-runtime}/src/main/java/opennlp/tools/util/normalizer/CharClass.java (100%) rename {opennlp-api => opennlp-core/opennlp-runtime}/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java (100%) rename 
{opennlp-api => opennlp-core/opennlp-runtime}/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java (100%) rename {opennlp-api => opennlp-core/opennlp-runtime}/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java (100%) rename {opennlp-api => opennlp-core/opennlp-runtime}/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java (100%) rename {opennlp-api => opennlp-core/opennlp-runtime}/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java (100%) rename {opennlp-api => opennlp-core/opennlp-runtime}/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java (100%) rename {opennlp-api => opennlp-core/opennlp-runtime}/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java (100%) diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java deleted file mode 100644 index 389146596..000000000 --- a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/AnalyzedToken.java +++ /dev/null @@ -1,34 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package opennlp.tools.util.normalizer; - -import opennlp.tools.util.Span; - -/** - * One analyzed token: its character span in the source text, the original token text, and the - * normalized form used for matching or indexing. - * - *

The span ties the normalized term back to the original text, so a search hit on - * {@link #normalized()} can be highlighted against the source using {@link #span()} even though - * the normalized form may differ in length (for example after diacritic folding).

- * - * @param span The character span of the token in the source text. - * @param original The original token text. - * @param normalized The normalized token text (the match/index form). - */ -public record AnalyzedToken(Span span, String original, String normalized) { -} diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java b/opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java deleted file mode 100644 index 7e8ce8d77..000000000 --- a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/TextAnalyzer.java +++ /dev/null @@ -1,93 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package opennlp.tools.util.normalizer; - -import java.util.ArrayList; -import java.util.List; -import java.util.Objects; - -import opennlp.tools.util.Span; - -/** - * Splits text into tokens and normalizes each one, keeping every token's original character span. - * - *

This is the offset-preserving building block for search and BM25-style matching: tokens are - * found with a {@link CharClass} splitter (O(1) membership, a single cursor pass, no regular - * expression) and each token's text is run through a {@link CharSequenceNormalizer}. The result is - * a list of {@link AnalyzedToken}, each carrying the source {@link Span} alongside its normalized - * form, so a match on the normalized term can always be reported and highlighted against the - * original text even when normalization changes a token's length.

- */ -public final class TextAnalyzer { - - private final CharClass splitter; - private final CharSequenceNormalizer normalizer; - - /** - * Creates an analyzer. - * - * @param splitter The character class whose members delimit tokens (typically - * {@link CharClass#whitespace()}). - * @param normalizer The per-token normalizer. - */ - public TextAnalyzer(CharClass splitter, CharSequenceNormalizer normalizer) { - this.splitter = Objects.requireNonNull(splitter, "splitter"); - this.normalizer = Objects.requireNonNull(normalizer, "normalizer"); - } - - /** - * Creates an analyzer that splits on Unicode whitespace. - * - * @param normalizer The per-token normalizer. - * @return The analyzer. - */ - public static TextAnalyzer whitespace(CharSequenceNormalizer normalizer) { - return new TextAnalyzer(CharClass.whitespace(), normalizer); - } - - /** - * Tokenizes {@code text} and normalizes each token. - * - * @param text The text to analyze. - * @return The analyzed tokens, in order, each with its source span and normalized form. - */ - public List<AnalyzedToken> analyze(CharSequence text) { - Objects.requireNonNull(text, "text"); - final List<AnalyzedToken> tokens = new ArrayList<>(); - for (final Span span : splitter.splitSpans(text)) { - final String original = text.subSequence(span.getStart(), span.getEnd()).toString(); - final String normalized = normalizer.normalize(original).toString(); - tokens.add(new AnalyzedToken(span, original, normalized)); - } - return tokens; - } - - /** - * Tokenizes {@code text} and returns only the normalized terms. - * - * @param text The text to analyze. - * @return The normalized token terms, in order. 
- */ - public List<String> terms(CharSequence text) { - final List<AnalyzedToken> analyzed = analyze(text); - final List<String> terms = new ArrayList<>(analyzed.size()); - for (final AnalyzedToken token : analyzed) { - terms.add(token.normalized()); - } - return terms; - } -} diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java b/opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java deleted file mode 100644 index 77decf860..000000000 --- a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/TextAnalyzerTest.java +++ /dev/null @@ -1,102 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package opennlp.tools.util.normalizer; - -import java.util.List; -import java.util.Locale; - -import org.junit.jupiter.api.Test; - -import static org.junit.jupiter.api.Assertions.assertEquals; -import static org.junit.jupiter.api.Assertions.assertThrows; -import static org.junit.jupiter.api.Assertions.assertTrue; - -public class TextAnalyzerTest { - - private static final CharSequenceNormalizer LOWER = s -> s.toString().toLowerCase(Locale.ROOT); - - private static String cp(int codePoint) { - return new String(Character.toChars(codePoint)); - } - - @Test - void testAnalyzePreservesSpansAndNormalizesTokens() { - final String text = "Hello WORLD"; - final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(LOWER).analyze(text); - - assertEquals(2, tokens.size()); - assertEquals(0, tokens.get(0).span().getStart()); - assertEquals(5, tokens.get(0).span().getEnd()); - assertEquals("Hello", tokens.get(0).original()); - assertEquals("hello", tokens.get(0).normalized()); - assertEquals("WORLD", tokens.get(1).original()); - assertEquals("world", tokens.get(1).normalized()); - assertEquals("Hello", tokens.get(0).span().getCoveredText(text).toString()); - } - - @Test - void testSpanStaysCorrectWhenNormalizedLengthChanges() { - final CharSequenceNormalizer bracket = s -> "[" + s + "]"; - final String text = "ab cd"; - final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(bracket).analyze(text); - - assertEquals("[ab]", tokens.get(0).normalized()); - assertEquals(0, tokens.get(0).span().getStart()); - assertEquals(2, tokens.get(0).span().getEnd()); - assertEquals(3, tokens.get(1).span().getStart()); - assertEquals(5, tokens.get(1).span().getEnd()); - } - - @Test - void testSplitsOnUnicodeWhitespace() { - final String text = "alpha" + cp(0x00A0) + "beta"; - final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(LOWER).analyze(text); - - assertEquals(2, tokens.size()); - assertEquals("alpha", tokens.get(0).normalized()); - assertEquals("beta", 
tokens.get(1).normalized()); - } - - @Test - void 
testSupplementaryTokenIsKeptIntact() { - final String emoji = cp(0x1F600); - final String text = "a " + emoji + " b"; - final List<AnalyzedToken> tokens = TextAnalyzer.whitespace(LOWER).analyze(text); - - assertEquals(3, tokens.size()); - assertEquals(emoji, tokens.get(1).original()); - assertTrue(tokens.get(1).span().getEnd() - tokens.get(1).span().getStart() == emoji.length()); - } - - @Test - void testTermsReturnsNormalizedFormsOnly() { - assertEquals(List.of("a", "b", "c"), TextAnalyzer.whitespace(LOWER).terms("A B C")); - } - - @Test - void testEmptyAndWhitespaceOnlyYieldNoTokens() { - assertEquals(List.of(), TextAnalyzer.whitespace(LOWER).analyze("")); - assertEquals(List.of(), TextAnalyzer.whitespace(LOWER).analyze(" ")); - } - - @Test - void testRejectsNullArguments() { - assertThrows(NullPointerException.class, () -> new TextAnalyzer(null, LOWER)); - assertThrows(NullPointerException.class, () -> new TextAnalyzer(CharClass.whitespace(), null)); - assertThrows(NullPointerException.class, () -> TextAnalyzer.whitespace(LOWER).analyze(null)); - } -} diff --git a/opennlp-core/opennlp-ml/opennlp-dl/pom.xml b/opennlp-core/opennlp-ml/opennlp-dl/pom.xml index 76d27f6fa..4075c8f6e 100644 --- a/opennlp-core/opennlp-ml/opennlp-dl/pom.xml +++ b/opennlp-core/opennlp-ml/opennlp-dl/pom.xml @@ -37,6 +37,11 @@ org.apache.opennlp opennlp-api + + + org.apache.opennlp + opennlp-runtime + @@ -45,13 +50,6 @@ ${onnxruntime.version} - - - org.apache.opennlp - opennlp-runtime - test - - org.slf4j slf4j-api diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CharClass.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CharClass.java similarity index 100% rename from opennlp-api/src/main/java/opennlp/tools/util/normalizer/CharClass.java rename to opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CharClass.java diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java 
b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java similarity index 100% rename from opennlp-api/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java rename to opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/CodePointSet.java diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java similarity index 100% rename from opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java rename to opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/UnicodeDash.java diff --git a/opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java similarity index 100% rename from opennlp-api/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java rename to opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/UnicodeWhitespace.java diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java similarity index 100% rename from opennlp-api/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java rename to opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CharClassTest.java diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java similarity index 100% rename from opennlp-api/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java rename to opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/CodePointSetTest.java diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java 
b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java similarity index 100% rename from opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java rename to opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeDashTest.java diff --git a/opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java similarity index 100% rename from opennlp-api/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java rename to opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/UnicodeWhitespaceTest.java diff --git a/opennlp-docs/src/docbkx/normalizer.xml b/opennlp-docs/src/docbkx/normalizer.xml index 173a34c76..55376f538 100644 --- a/opennlp-docs/src/docbkx/normalizer.xml +++ b/opennlp-docs/src/docbkx/normalizer.xml @@ -387,34 +387,15 @@ CharClass wsPlus = CharClass.whitespace().withAdditional(extra);]]> -
- Offset-preserving analysis for search - - TextAnalyzer tokenizes text and normalizes each token while keeping every - token's source span. This is the building block for BM25-style matching: the normalized - term is what you index or query, and the Span ties it back to the original - text for highlighting, even when normalization changes a token's length. - - - character span in the original text - // token.original() -> the raw token, e.g. "Café" - // token.normalized() -> the search term, e.g. "cafe" -} - -List terms = analyzer.terms("Café au lait"); // ["cafe", "au", "lait"]]]> - -
-
The layered term model - Where TextAnalyzer gives each token an original and a single normalized form, - TermAnalyzer gives each token a stack of normalization - layers. A Term is one token projected through an ordered chain of + TermAnalyzer tokenizes text and gives each token a + stack of normalization layers while keeping its source span. It is the + offset-preserving entry point for matching and BM25-style search: the normalized form is + what you index or query, and the span ties every layer back to the original text for + highlighting, even when normalization changes a token's length. A Term is one + token projected through an ordered chain of Dimensions: original, NFC, NFKC, whitespace, dash, case fold, accent fold, confusable fold, stem, and lemma. The order is fixed because the transforms do not commute (case folding then accent folding differs from the reverse). The original is always kept, From fbfc4c92507a5c48e7be6ed27ceea51003ee2613 Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Fri, 19 Jun 2026 17:19:21 -0400 Subject: [PATCH 10/11] OPENNLP-1850 Single-source the shared normalization rungs on Dimension Each character-level Dimension now carries its default CharSequenceNormalizer (resolved lazily via a Supplier, so the confusables table is not loaded on enum init). TermAnalyzer drops its parallel defaultTransforms() map and reads the default from the dimension (builder overrides still win); TextNormalizer's nfc, nfkc, whitespace, dash, case-fold, and accent-fold methods delegate to the same source instead of re-listing the normalizers. The shared rungs are now defined once. TextNormalizer-only cleanup steps (quotes, digits, ellipsis, bullets, strip-invisible) stay standalone. 
--- .../tools/util/normalizer/Dimension.java | 51 +++++++++++++------ .../tools/util/normalizer/TermAnalyzer.java | 21 +++----- .../tools/util/normalizer/TextNormalizer.java | 12 ++--- .../tools/util/normalizer/DimensionTest.java | 45 ++++++++++++++++ 4 files changed, 93 insertions(+), 36 deletions(-) create mode 100644 opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/DimensionTest.java diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java index 56a9fd629..7caece7a0 100644 --- a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/Dimension.java @@ -16,47 +16,68 @@ */ package opennlp.tools.util.normalizer; +import java.util.function.Supplier; + /** * A layer of the {@link Term} normalization stack, in increasing order of aggressiveness. A * {@link TermAnalyzer} applies a configured prefix of these to each token; the declaration order is * the canonical pipeline order, because the transforms do not commute (case folding then accent * folding differs from the reverse for Turkish dotted/dotless i and the German eszett). * - *

{@link #ORIGINAL} is the source token and is always present. The character-level dimensions - * have a default transform and can therefore be requested from any term. {@link #STEM} and - * {@link #LEMMA} are token-level and require a {@link opennlp.tools.stemmer.Stemmer} or - * {@link opennlp.tools.lemmatizer.Lemmatizer} to be configured on the analyzer; {@link #LEMMA} also - * requires a part-of-speech tag.

+ *

This enum is the single definition of the character-level steps: each one carries its default + * {@link CharSequenceNormalizer}, which both {@link TermAnalyzer} and {@link TextNormalizer} read + * from rather than re-listing. The default is resolved lazily, so loading this enum does not eagerly + * initialize heavy data such as the confusables table.

+ * + *

{@link #ORIGINAL} is the source token and is always present. {@link #STEM} and {@link #LEMMA} + * are token-level and have no default normalizer; they require a + * {@link opennlp.tools.stemmer.Stemmer} or {@link opennlp.tools.lemmatizer.Lemmatizer} on the + * analyzer ({@link #LEMMA} also a part-of-speech tag).

*/ public enum Dimension { /** The original token text, the canonical source of truth. */ - ORIGINAL, + ORIGINAL(null), /** Unicode canonical composition (NFC); lossless under canonical equivalence. */ - NFC, + NFC(NfcCharSequenceNormalizer::getInstance), /** Unicode compatibility composition (NFKC); lossy (for example superscripts to digits). */ - NFKC, + NFKC(NfkcCharSequenceNormalizer::getInstance), /** Unicode whitespace folded to ASCII spaces. */ - WHITESPACE, + WHITESPACE(WhitespaceCharSequenceNormalizer::getInstance), /** Unicode dashes folded to the ASCII hyphen-minus. */ - DASH, + DASH(DashCharSequenceNormalizer::getInstance), /** Case folding; lossy and locale sensitive. */ - CASE_FOLD, + CASE_FOLD(CaseFoldCharSequenceNormalizer::getInstance), /** Diacritic and accent folding; lossy, script gated, and language-wrong for some languages. */ - ACCENT_FOLD, + ACCENT_FOLD(AccentFoldCharSequenceNormalizer::getInstance), /** Confusable (homoglyph) skeleton folding per UTS #39; lossy, for matching only. */ - CONFUSABLE_FOLD, + CONFUSABLE_FOLD(ConfusableSkeletonCharSequenceNormalizer::getInstance), /** Stemming through a configured {@link opennlp.tools.stemmer.Stemmer}. */ - STEM, + STEM(null), /** Lemmatization through a configured {@link opennlp.tools.lemmatizer.Lemmatizer}. */ - LEMMA + LEMMA(null); + + private final Supplier defaultNormalizer; + + Dimension(Supplier defaultNormalizer) { + this.defaultNormalizer = defaultNormalizer; + } + + /** + * {@return the default character-level normalizer for this dimension, or {@code null} for + * {@link #ORIGINAL}, {@link #STEM}, and {@link #LEMMA}} The normalizer is resolved lazily, so it + * is not initialized until first requested. + */ + public CharSequenceNormalizer defaultNormalizer() { + return defaultNormalizer == null ? 
null : defaultNormalizer.get(); + } } diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java index b08da0a9b..7262d580d 100644 --- a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TermAnalyzer.java @@ -61,25 +61,13 @@ private TermAnalyzer(Builder builder) { Collections.sort(ordered); // canonical pipeline order (enum declaration order) this.chain = List.copyOf(ordered); this.finalDimension = ordered.isEmpty() ? Dimension.ORIGINAL : ordered.get(ordered.size() - 1); - this.transforms = defaultTransforms(); - this.transforms.putAll(builder.transforms); + // Only the per-analyzer overrides from the builder; the defaults live on Dimension itself. + this.transforms = new EnumMap<>(builder.transforms); this.stemmer = builder.stemmer; this.lemmatizer = builder.lemmatizer; this.tokenizer = builder.tokenizer; } - private static EnumMap defaultTransforms() { - final EnumMap map = new EnumMap<>(Dimension.class); - map.put(Dimension.NFC, NfcCharSequenceNormalizer.getInstance()); - map.put(Dimension.NFKC, NfkcCharSequenceNormalizer.getInstance()); - map.put(Dimension.WHITESPACE, WhitespaceCharSequenceNormalizer.getInstance()); - map.put(Dimension.DASH, DashCharSequenceNormalizer.getInstance()); - map.put(Dimension.CASE_FOLD, CaseFoldCharSequenceNormalizer.getInstance()); - map.put(Dimension.ACCENT_FOLD, AccentFoldCharSequenceNormalizer.getInstance()); - map.put(Dimension.CONFUSABLE_FOLD, ConfusableSkeletonCharSequenceNormalizer.getInstance()); - return map; - } - /** * {@return a new builder} */ @@ -160,7 +148,10 @@ String apply(Dimension dimension, String input, String posTag) { } return lemmatizer.lemmatize(new String[] {input}, new String[] {posTag})[0]; default: - return 
transforms.get(dimension).normalize(input).toString(); + // A builder override wins; otherwise the dimension's own default normalizer. + final CharSequenceNormalizer normalizer = transforms.containsKey(dimension) + ? transforms.get(dimension) : dimension.defaultNormalizer(); + return normalizer.normalize(input).toString(); } } diff --git a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java index c1bac6409..fc2675c4e 100644 --- a/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java +++ b/opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/normalizer/TextNormalizer.java @@ -49,12 +49,12 @@ public static TextNormalizer builder() { /** {@return this builder with NFC canonical composition appended} */ public TextNormalizer nfc() { - return add(NfcCharSequenceNormalizer.getInstance()); + return add(Dimension.NFC.defaultNormalizer()); } /** {@return this builder with NFKC compatibility composition appended} */ public TextNormalizer nfkc() { - return add(NfkcCharSequenceNormalizer.getInstance()); + return add(Dimension.NFKC.defaultNormalizer()); } /** {@return this builder with invisible/bidi control stripping appended} */ @@ -64,7 +64,7 @@ public TextNormalizer stripInvisible() { /** {@return this builder with Unicode whitespace collapsing appended} */ public TextNormalizer whitespace() { - return add(WhitespaceCharSequenceNormalizer.getInstance()); + return add(Dimension.WHITESPACE.defaultNormalizer()); } /** {@return this builder with quotation-mark folding appended} */ @@ -74,7 +74,7 @@ public TextNormalizer quotes() { /** {@return this builder with dash folding appended} */ public TextNormalizer dashes() { - return add(DashCharSequenceNormalizer.getInstance()); + return add(Dimension.DASH.defaultNormalizer()); } /** {@return this builder with decimal-digit folding appended} */ @@ 
-94,12 +94,12 @@ public TextNormalizer bullets() { /** {@return this builder with case folding appended} */ public TextNormalizer caseFold() { - return add(CaseFoldCharSequenceNormalizer.getInstance()); + return add(Dimension.CASE_FOLD.defaultNormalizer()); } /** {@return this builder with script-gated diacritic folding appended} */ public TextNormalizer accentFold() { - return add(AccentFoldCharSequenceNormalizer.getInstance()); + return add(Dimension.ACCENT_FOLD.defaultNormalizer()); } /** diff --git a/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/DimensionTest.java b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/DimensionTest.java new file mode 100644 index 000000000..5a7e8e9e5 --- /dev/null +++ b/opennlp-core/opennlp-runtime/src/test/java/opennlp/tools/util/normalizer/DimensionTest.java @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package opennlp.tools.util.normalizer; + +import org.junit.jupiter.api.Test; + +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertSame; + +public class DimensionTest { + + @Test + void testCharacterDimensionsExposeTheirSingleSourcedDefault() { + assertSame(NfcCharSequenceNormalizer.getInstance(), Dimension.NFC.defaultNormalizer()); + assertSame(NfkcCharSequenceNormalizer.getInstance(), Dimension.NFKC.defaultNormalizer()); + assertSame(WhitespaceCharSequenceNormalizer.getInstance(), + Dimension.WHITESPACE.defaultNormalizer()); + assertSame(DashCharSequenceNormalizer.getInstance(), Dimension.DASH.defaultNormalizer()); + assertSame(CaseFoldCharSequenceNormalizer.getInstance(), + Dimension.CASE_FOLD.defaultNormalizer()); + assertSame(AccentFoldCharSequenceNormalizer.getInstance(), + Dimension.ACCENT_FOLD.defaultNormalizer()); + } + + @Test + void testTokenLevelAndOriginalHaveNoDefault() { + assertNull(Dimension.ORIGINAL.defaultNormalizer()); + assertNull(Dimension.STEM.defaultNormalizer()); + assertNull(Dimension.LEMMA.defaultNormalizer()); + } +} From 5ab7f8735cfe40bab8402f94830a9d5ecb47eaad Mon Sep 17 00:00:00 2001 From: Kristian Rickert Date: Fri, 19 Jun 2026 19:22:39 -0400 Subject: [PATCH 11/11] OPENNLP-1850 Document bundled Unicode data file licensing The bundled Unicode data files (WordBreakProperty.txt, ExtendedPictographic.txt, confusables.txt, and the WordBreakTest.txt test fixture) ship under the Unicode License V3 (ASF Category A). Make the release plumbing reflect that: - Add the Unicode attribution to src/license/NOTICE.template so it survives NOTICE regeneration; it previously lived only in the generated NOTICE. - Embed the full Unicode License V3 text in LICENSE, as is already done for the bundled stopword lists. The newer Unicode headers only link to terms_of_use.html rather than embedding the text, so the NOTICE link alone is not enough. 
- Exclude the four bundled .txt files in rat-excludes so apache-release RAT does not flag their non-Apache headers. - Correct the ExtendedPictographic.txt description: it is a filtered subset of emoji-data.txt (only the Extended_Pictographic property, renamed), not an unmodified copy. --- LICENSE | 48 +++++++++++++++++++++++++++++++++++++ NOTICE | 41 ++++++++++++++++++------------- rat-excludes | 6 +++++ src/license/NOTICE.template | 30 +++++++++++++++++++++++ 4 files changed, 109 insertions(+), 16 deletions(-) diff --git a/LICENSE b/LICENSE index 58a20c820..8931adb28 100644 --- a/LICENSE +++ b/LICENSE @@ -370,3 +370,51 @@ The following license applies to the SLF4J API: LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +The following license applies to the bundled Unicode data files in +opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29 +(WordBreakProperty.txt, ExtendedPictographic.txt), +opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer +(confusables.txt), and +opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29 +(WordBreakTest.txt): + + UNICODE LICENSE V3 + + COPYRIGHT AND PERMISSION NOTICE + + Copyright (c) 1991-2026 Unicode, Inc. + + NOTICE TO USER: Carefully read the following legal agreement. BY + DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING DATA FILES, AND/OR + SOFTWARE, YOU UNEQUIVOCALLY ACCEPT, AND AGREE TO BE BOUND BY, ALL OF THE + TERMS AND CONDITIONS OF THIS AGREEMENT. IF YOU DO NOT AGREE, DO NOT + DOWNLOAD, INSTALL, COPY, DISTRIBUTE OR USE THE DATA FILES OR SOFTWARE. 
+ + Permission is hereby granted, free of charge, to any person obtaining a + copy of data files and any associated documentation (the "Data Files") or + software and any associated documentation (the "Software") to deal in the + Data Files or Software without restriction, including without limitation + the rights to use, copy, modify, merge, publish, distribute, and/or sell + copies of the Data Files or Software, and to permit persons to whom the + Data Files or Software are furnished to do so, provided that either (a) + this copyright and permission notice appear with all copies of the Data + Files or Software, or (b) this copyright and permission notice appear in + associated Documentation. + + THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY + KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF + THIRD PARTY RIGHTS. + + IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE + BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, + OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, + WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, + ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA + FILES OR SOFTWARE. + + Except as contained in this notice, the name of a copyright holder shall + not be used in advertising or otherwise to promote the sale, use or other + dealings in these Data Files or Software without prior written + authorization of the copyright holder. diff --git a/NOTICE b/NOTICE index 4a8b22075..08b702225 100644 --- a/NOTICE +++ b/NOTICE @@ -94,24 +94,33 @@ SOFTWARE. 
============================================================================ -The Unicode Character Database data files in -opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29 -(WordBreakProperty.txt, the upstream WordBreakProperty-17.0.0.txt, and -ExtendedPictographic.txt, the upstream emoji-data.txt) and the conformance test -data in -opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29 -(WordBreakTest.txt, the upstream WordBreakTest-17.0.0.txt) are unmodified data -files from the Unicode Character Database, version 17.0.0, published by Unicode, -Inc. (https://www.unicode.org/Public/UCD/). The Unicode security data file -opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt -is the unmodified confusables.txt, version 17.0.0, from the Unicode Security -Mechanisms (UTS #39, https://www.unicode.org/Public/security/). +This product bundles data files from the Unicode Character Database (UCD) +and the Unicode Security Mechanisms, version 17.0.0, published by Unicode, +Inc. (https://www.unicode.org/Public/). + + * opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/WordBreakProperty.txt + is the upstream WordBreakProperty-17.0.0.txt, unmodified except for the + file name. + * opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29/WordBreakTest.txt + is the upstream WordBreakTest-17.0.0.txt, unmodified except for the file + name. + * opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt + is the upstream confusables.txt from the Unicode Security Mechanisms + (UTS #39), unmodified. + * opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/ExtendedPictographic.txt + is derived from the upstream emoji-data.txt (Emoji Data for UTS #51, + version 17.0): it keeps only the lines that assign the + Extended_Pictographic property and is renamed accordingly. 
It is a + filtered subset; the upstream file additionally carries the Emoji, + Emoji_Presentation, Emoji_Modifier, Emoji_Modifier_Base, and + Emoji_Component properties, which are not retained. -Copyright (c) 1991-2025 Unicode, Inc. All rights reserved. -Distributed under the Unicode Terms of Use and License -(https://www.unicode.org/terms_of_use.html, https://www.unicode.org/license.txt). The original Unicode copyright and license header is preserved verbatim at the -top of each bundled file. +top of each bundled file. These files are distributed under the Unicode License +V3, the full text of which is reproduced in the LICENSE file accompanying this +distribution. + +Copyright (c) 1991-2025 Unicode, Inc. All rights reserved. ============================================================================ List of third-party dependencies grouped by their license type. diff --git a/rat-excludes b/rat-excludes index aa2d47e5d..5a5d86b90 100644 --- a/rat-excludes +++ b/rat-excludes @@ -64,3 +64,9 @@ src/test/resources/*.info src/main/java/opennlp/tools/stemmer/snowball/*.java src/main/resources/opennlp/tools/stopword/*.txt + + +src/main/resources/opennlp/tools/tokenize/uax29/WordBreakProperty.txt +src/main/resources/opennlp/tools/tokenize/uax29/ExtendedPictographic.txt +src/main/resources/opennlp/tools/util/normalizer/confusables.txt +src/test/resources/opennlp/tools/tokenize/uax29/WordBreakTest.txt diff --git a/src/license/NOTICE.template b/src/license/NOTICE.template index 81feb5b5e..67615b3ff 100644 --- a/src/license/NOTICE.template +++ b/src/license/NOTICE.template @@ -92,4 +92,34 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+============================================================================ + +This product bundles data files from the Unicode Character Database (UCD) +and the Unicode Security Mechanisms, version 17.0.0, published by Unicode, +Inc. (https://www.unicode.org/Public/). + + * opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/WordBreakProperty.txt + is the upstream WordBreakProperty-17.0.0.txt, unmodified except for the + file name. + * opennlp-core/opennlp-runtime/src/test/resources/opennlp/tools/tokenize/uax29/WordBreakTest.txt + is the upstream WordBreakTest-17.0.0.txt, unmodified except for the file + name. + * opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/util/normalizer/confusables.txt + is the upstream confusables.txt from the Unicode Security Mechanisms + (UTS #39), unmodified. + * opennlp-core/opennlp-runtime/src/main/resources/opennlp/tools/tokenize/uax29/ExtendedPictographic.txt + is derived from the upstream emoji-data.txt (Emoji Data for UTS #51, + version 17.0): it keeps only the lines that assign the + Extended_Pictographic property and is renamed accordingly. It is a + filtered subset; the upstream file additionally carries the Emoji, + Emoji_Presentation, Emoji_Modifier, Emoji_Modifier_Base, and + Emoji_Component properties, which are not retained. + +The original Unicode copyright and license header is preserved verbatim at the +top of each bundled file. These files are distributed under the Unicode License +V3, the full text of which is reproduced in the LICENSE file accompanying this +distribution. + +Copyright (c) 1991-2025 Unicode, Inc. All rights reserved. + ============================================================================ \ No newline at end of file