[GTER] Registro.br RPKI issues?

Job Snijders job at sobornost.net
Mon May 15 19:19:43 -03 2023


Hi Fred,

Thanks for the quick response!

On Mon, May 15, 2023 at 06:24:54PM -0300, Frederico A C Neves wrote:
> > Any idea what happened?
> 
> We're investigating the CA and publication server but so far we've no
> idea of any event that originated the issue.

Thank you for investigating and transparently sharing your uncertainty
as to what happened. I too am a bit puzzled by the data.

Operating under the assumption that this was a fluke of sorts (because
NIC.BR RPKI services overall seem stable), it is possible this type of
event will never happen again, or will happen again within a few hours,
days, or months. I hope you don't mind some suggestions on how to
proceed to find the root cause:

1/ Perhaps an error can be raised (and sent to the NIC.BR NOC) when the
   process (which writes out the RRDP XML) notices that multiple
   <publish> elements share the same value in the 'uri' attribute. While
   this alert might not pinpoint the root cause of the issue, monitoring
   for that error condition will give more insight as to when it
   happens.

2/ Extended retention times for RRDP data. Debugging the RRDP data
   produced by NIC.BR is challenging because between an issue arising
   and people being paged to look at an issue there can be a multi-hour
   delay. The NIC.BR deltas & snapshots seem to be deleted on an
   aggressive schedule: XML data is deleted within hours of the XML data
   being generated [example 1].
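
To make suggestion (1) concrete, here is a minimal sketch in Python of
the kind of check I have in mind. It is not how the NIC.BR publication
server works (I have no insight into that); it only assumes the RRDP XML
uses the namespace from RFC 8182. The function name and the sample data
are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# XML namespace used by RRDP snapshot and delta files (RFC 8182).
RRDP_NS = "http://www.ripe.net/rpki/rrdp"


def duplicate_publish_uris(xml_text):
    """Return the sorted 'uri' values shared by more than one <publish>.

    Hypothetical helper: an empty list means no duplicates were found.
    """
    root = ET.fromstring(xml_text)
    counts = Counter(
        elem.get("uri") for elem in root.iter("{%s}publish" % RRDP_NS)
    )
    return sorted(
        uri for uri, seen in counts.items() if uri is not None and seen > 1
    )


# Made-up delta with one URI published twice (not real NIC.BR data).
SAMPLE = """<delta xmlns="http://www.ripe.net/rpki/rrdp" version="1"
       session_id="00000000-0000-0000-0000-000000000000" serial="2">
  <publish uri="rsync://example.net/repo/a.roa">AAAA</publish>
  <publish uri="rsync://example.net/repo/b.roa">BBBB</publish>
  <publish uri="rsync://example.net/repo/a.roa">CCCC</publish>
</delta>"""

if __name__ == "__main__":
    for uri in duplicate_publish_uris(SAMPLE):
        print("duplicate <publish> uri:", uri)
```

Running such a check right before the XML is written out, and alerting
on a non-empty result, would at least timestamp each occurrence.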

In the context of suggestion (2) - I am cognizant that a standards-compliant
RRDP client would not run into any issue with the current deletion
schedule. I also understand storage & network resources have a cost. But
human debuggers aren't standards-compliant ;-)

My plea is - if it is feasible - to retain deltas & snapshots for
10 days. Ten days would be a great help in being able to exactly
reference what went wrong where (if and only if anything goes wrong).
For example, the URL [example 1] I referenced in my earlier email today
in this thread already has been garbage collected and no longer shows
useful data.
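
As a rough sketch of what such a retention rule could look like (again
in Python, and again purely illustrative: the function name, the inputs,
and the idea of tracking modification times this way are my assumptions,
not a description of the NIC.BR setup):

```python
from datetime import datetime, timedelta, timezone

# Retention window proposed above.
RETENTION = timedelta(days=10)


def files_to_delete(mtimes, referenced, now, retention=RETENTION):
    """Pick RRDP files that are safe to garbage collect.

    mtimes:     dict mapping file path -> last-modified datetime
    referenced: set of paths still listed in the current notification file
    A file is selected only if it is no longer referenced AND it is older
    than the retention window.
    """
    cutoff = now - retention
    return sorted(
        path
        for path, mtime in mtimes.items()
        if path not in referenced and mtime < cutoff
    )


if __name__ == "__main__":
    utc = timezone.utc
    now = datetime(2023, 5, 15, tzinfo=utc)
    mtimes = {
        "127060/delta.xml": datetime(2023, 5, 1, tzinfo=utc),
        "127073/delta.xml": datetime(2023, 5, 10, tzinfo=utc),
        "127074/snapshot.xml": datetime(2023, 4, 1, tzinfo=utc),
    }
    print(files_to_delete(mtimes, {"127074/snapshot.xml"}, now))
```

The point of the two conditions is that clients are never affected:
anything the notification file still points to stays, and everything
else simply lingers for ten days before being removed.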

Kind regards,

Job

[example 1]: https://rpki-repo.registro.br/rrdp/98b30a79-a85f-4f0b-94d0-f969146a0bfd/127073/d5015a7673a46da3/delta.xml

