Earlier, AYTF was adding an additional CC when returning the unformatted result, in cases where input digits were dropped during formatting. E.g. MX case: "+5213314010666" => "+52 +5213314010666". b/183053929
Now we proactively ensure that no formatting is applied when the chosen format would otherwise have led to some digits being dropped.
Why the input digits are dropped:
- In MX, the mobile token (1) is no longer used, so when it is present in the input, the formatted result should not contain it.
- However, during AYTF we should not remove input digits on the fly.
- More details in cl/373115460 and b/183053929
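A minimal sketch of the guard described above, not the actual AYTF code: a candidate format is only accepted if it keeps every digit the user typed. The class and method names are hypothetical, and the candidate string "+52 33 1401 0666" is an assumed formatter output used for illustration.

```java
final class AytfDigitGuard {
    /** Keeps only the ASCII digits from the given text. */
    static String digitsOnly(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c >= '0' && c <= '9') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Returns the formatted candidate only if it preserves every typed digit;
     * otherwise returns the typed input unchanged (no formatting applied).
     */
    static String formatOrKeepInput(String typed, String candidate) {
        return digitsOnly(typed).equals(digitsOnly(candidate)) ? candidate : typed;
    }
}
```

With the MX input above, a candidate that drops the mobile token is rejected and the input comes back untouched, rather than being prefixed with a second CC.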
The format of the data has changed such that formatting patterns now only refer to \d, not specific numbers, so we don't need the special handling in the AYTF code which converts them to \d.
* Configured the Russian extension marker доб as valid when parsing numbers.
* Escaped non-ASCII characters for better standardization.
* Added the generated RU test data file. This is why the Travis tests are failing at: https://travis-ci.org/googlei18n/libphonenumber/builds/407078226
* Removed manual support for encoded versions, as this is already taken care of by the regex flags. Updated based on new review comments.
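A rough illustration of the points above, assuming a simplified extension pattern (the real parser's pattern is more involved). The marker is written with Unicode escapes so the source stays ASCII-only, and case variants are left to the regex flags instead of being listed by hand. The class name, method name and the 1-7 digit limit are assumptions for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RussianExtensionSketch {
    // The Cyrillic marker "dob", spelled with Unicode escapes to keep the file ASCII-only.
    private static final String DOB = "\u0434\u043E\u0431";

    // CASE_INSENSITIVE plus UNICODE_CASE lets the same pattern match upper-case
    // Cyrillic too, so no hand-written variants are needed.
    private static final Pattern EXTENSION = Pattern.compile(
        "\\s*" + DOB + "\\.?\\s*(\\d{1,7})$",
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    /** Returns the trailing extension digits, or an empty string if none is found. */
    static String extractExtension(String number) {
        Matcher m = EXTENSION.matcher(number);
        return m.find() ? m.group(1) : "";
    }
}
```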
This hasn't been used for a long time.
Deleted all methods that refer to this, and removed the field from the test XML file (somehow this was missed) as well.
Regenerated jars: because we no longer write a boolean to indicate whether we have the possible number pattern or not, all the Java metadata files needed regenerating too.
Deleted some data from the JS phone metadata implementation that isn't used by JS either (national number matcher pattern).
* It was confusing and unnecessary. Not every country that has short numbers beginning with a 0 had it, and it is not the only digit that could overlap with a national prefix and hence be interpreted incorrectly.
Metadata changes:
-- Drops the "-1" that was erroneously included in possible lengths before (didn't break anything, but was wrong) - it was a possibleLength of a sub-component so got added to the generalDesc possibleLengths
-- possibleNumberPattern no longer inherited: we don't use this anyway, we will do another CL soon to stop including it at all in the generated metadata
-- exampleNumber is no longer set on fixed-line and mobile elements from the generalDesc
XML file changes:
-- Stopped specifying "NA" and "-1" specifically for fixed-line and mobile blocks; now they are treated as every other type of phone number: if missing, don't fill them in from generalDesc, but leave them missing.
Code changes:
-- Stop using the exampleNumber on generalDesc for non-geo entities, but look at their phonenumber descs - the exampleNumber won't be stored on the generalDesc anymore. This affects porters if they either copied our build logic or used our built metadata in some way; they should update this method in their port too.
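To illustrate the "-1" item under "Metadata changes": a sketch of how generalDesc possible lengths could be derived from the sub-components without picking up the placeholder. This is not the actual build code; the class name, method name, and the reading of -1 as "this sub-component has no numbers" are assumptions.

```java
import java.util.List;
import java.util.TreeSet;

final class GeneralDescLengthsSketch {
    // Assumed placeholder meaning "no numbers of this type"; it is not a real length.
    static final int NO_NUMBERS_SENTINEL = -1;

    /** Returns the sorted union of all child possible lengths, skipping the placeholder. */
    static TreeSet<Integer> unionOfChildLengths(List<List<Integer>> childLengths) {
        TreeSet<Integer> result = new TreeSet<>();
        for (List<Integer> lengths : childLengths) {
            for (int length : lengths) {
                if (length != NO_NUMBERS_SENTINEL) {
                    result.add(length);
                }
            }
        }
        return result;
    }
}
```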
Changing PhoneNumberUtil to use the possibleLengths information, not the reg-exes.
Note the API is not changing, but the metadata is now somewhat stricter for
many countries, since before we applied only a minimum and maximum length for
most countries, and now we specify exactly which lengths are possible.
This has a flow-on effect when parsing, since we decide whether to do certain
operations like strip a national prefix based on whether the number is a
possible length before/after - when parsing, if the number is shorter than the *national* pattern, we no longer strip the national prefix.
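A simplified sketch of the two effects described above, not the PhoneNumberUtil implementation: an exact-length check is stricter than a min/max check, and the national prefix is kept when stripping it would leave a number that is too short. All names here are hypothetical, and the length sets are made-up examples rather than real metadata.

```java
import java.util.Collections;
import java.util.Set;

final class PossibleLengthSketch {
    /** Old-style check: anything between the shortest and longest length passes. */
    static boolean isWithinMinMax(String nationalNumber, Set<Integer> possibleLengths) {
        int len = nationalNumber.length();
        return len >= Collections.min(possibleLengths) && len <= Collections.max(possibleLengths);
    }

    /** New-style check: the length must be one of the listed possible lengths. */
    static boolean isPossibleLength(String nationalNumber, Set<Integer> possibleLengths) {
        return possibleLengths.contains(nationalNumber.length());
    }

    /** Strips the national prefix unless the remainder would be shorter than any possible length. */
    static String maybeStripNationalPrefix(String number, String prefix, Set<Integer> possibleLengths) {
        if (!number.startsWith(prefix)) {
            return number;
        }
        String stripped = number.substring(prefix.length());
        if (stripped.length() < Collections.min(possibleLengths)) {
            return number; // too short after stripping, so keep the digits as typed
        }
        return stripped;
    }
}
```

For example, with possible lengths {7, 10} an 8-digit number passes the min/max check but fails the exact check, which is the kind of change listed per country below.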
Affected countries:
AD (7 digits now invalid)
AM (7 digits now invalid)
AR (9 digits now invalid)
AZ (8 digits now invalid)
BG (4 digits now valid for local-only numbers)
BJ (5-7 digits now invalid)
CC/CX (5 digit numbers now possible: this should always have been the case, but the generalDesc was wrong and didn't reflect its child elements. We now calculate it based on them, which allows 5 digit numbers.)
CO (9 digits now invalid)
CR (9 digits now invalid)
ET (8 digits now invalid)
GE (7 and 8 digits now invalid)
GH (8 digits now invalid)
IL (5 and 6 digits now invalid)
IM/JE/GG (7, 8 and 9 digits now invalid, shortest national number length now 10, so parsing affected for numbers shorter than this)
IS (8 digits now invalid)
KG (7 and 8 digits now invalid)
KR (11 digits now invalid)
LA (7 digits now invalid)
LI (8 digits now invalid)
LY (8 digits now invalid)
MV (8 and 9 digits now invalid)
MW (8 digits now invalid)
MX (9 digits now invalid)
NP (9 digits now invalid)
SE (11 digits now invalid)
SG (9 digits now invalid)
SL (7 digits now invalid)
SM (7-9 digits now invalid)
UA (8 digits now invalid)
UG (8 digits now invalid)
UZ (8 digits now invalid)
https://groups.google.com/forum/#!topic/libphonenumber-discuss/75TOpTFVi08
* Initial code changes to support the new possibleLengths infrastructure
when building metadata. Includes setting these in the main metadata
file and changing the code to populate them in the PhoneMetadata proto
at build time. Updating tests for Java.
* Updating comment about the possible_lengths_local_only field.
* Adding the generated jars that build files containing the new possible
lengths fields.
* Rebuilding short-number metadata with the new proto field. Not doing
this at release time so we can see what changes in the metadata are due
to metadata fixes, rather than this possibleLengths field.
* Regeneration of phone number metadata with the new possible length
information filled in.
Includes some possible length changes for KR and BY, and a validation
fix for NA where a digit was missing.
* Updating the test proto metadata files with the new possible length
information.
* Added notes about the code changes.
* Regenerating C++ metadata with new possibleLength info and updating unit
test to check this works.
Making function IsNumberGeographical public. This function operates on
the type of number and the country it belongs to only, so may have some
false positives.
Using this in the geocoder so that geocoding is now limited to numbers that
we consider geographical, based on their type and country, rather than
just based on their type. The C++ geocoder did not previously check the
number type/country at all.
Indonesian and Chinese mobile numbers have now been added to the list of
possibly-geographical numbers.
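A sketch of the type-and-country check described above, written in Java for illustration rather than taken from the C++ or Java sources. The enum, the set name and the method signature are hypothetical; the set only contains the two calling codes named in this change (62 for Indonesia, 86 for China), while the real list may contain more.

```java
import java.util.Set;

final class GeographicalSketch {
    enum NumberType { FIXED_LINE, FIXED_LINE_OR_MOBILE, MOBILE, TOLL_FREE, OTHER }

    // Illustrative subset only: calling codes whose mobile numbers may carry
    // geographical information (ID = 62, CN = 86).
    private static final Set<Integer> GEO_MOBILE_CALLING_CODES = Set.of(62, 86);

    /**
     * Decides from the number type and country calling code alone, so it can
     * report some numbers as geographical that are not (false positives).
     */
    static boolean isNumberGeographical(NumberType type, int countryCallingCode) {
        return type == NumberType.FIXED_LINE
            || type == NumberType.FIXED_LINE_OR_MOBILE
            || (type == NumberType.MOBILE
                && GEO_MOBILE_CALLING_CODES.contains(countryCallingCode));
    }
}
```

A geocoder would call a check like this first and skip the lookup when it returns false.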