Posts: 646
Threads: 17
Joined: Sep 2024
12-03-2024, 03:53 PM
(This post was last modified: 12-05-2024, 03:25 AM by adelardsyah.)
I, and most likely a lot of other users here, am definitely sick of lazy, no-effort threads flying around and wasting everyone's time and credits.
The Rules do say that we have to add samples and some other information, including the number of lines/rows/users, especially in the Databases and Other Leaks sections, but there seems to be no clear "punishment" for not including them.
May I suggest that we strictly enforce this and, say, move the "guilty" thread to the removed section if samples and the number of lines have not been added after about 24 hours?
This would definitely filter out a lot of horrible threads and would also make it a little easier to check for duplicates in our database (because there will always be a few lines to cross-check).
Posts: 12,573
Threads: 352
Joined: Jun 2023
   
Replying to adelardsyah (12-03-2024, 03:53 PM):
This is already in place in the Other Leaks and Databases sections. Only in some cases (for example, if it is impossible to accurately determine the user count) are such threads not moved to removed. If you see such threads, send a report and specify the reason as "Missing sample".
Posts: 646
Threads: 17
Joined: Sep 2024
12-04-2024, 11:27 AM
(This post was last modified: 12-04-2024, 11:31 AM by adelardsyah.)
Replying to Addka72424 (12-04-2024, 10:32 AM):
Yes! I only learned that this rule exists when I saw loki's post somewhere in the database section, but is it possible to add the "24-hours-then-removed" rule as a line in https://breachforums.hn/Announcement-Leak-Section-Rules ?
That way you guys won't be overloaded with missing-sample reports, since others and I will be able to just remind posters and have something to refer to, because, you know, some people will just be a dick when reminded.
Posts: 12,573
Threads: 352
Joined: Jun 2023
   
Replying to adelardsyah (12-04-2024, 11:27 AM):
Yes, but in this case the time needs to be limited to 18 hours maximum, because after 24 hours credits can no longer be refunded to users. That is, if you open a thread for credits today, then tomorrow we will not be able to refund your credits if this happens. We plan to extend the credit refund time to 48 hours.
Posts: 646
Threads: 17
Joined: Sep 2024
Replying to Addka72424 (12-04-2024, 12:07 PM):
Ah, I see. Then I support that! I just hope that this rule can be strictly enforced. It's just very annoying to see lots of new users coming in to drop a file in search of credits, presumably in order to unlock a thread that they have been looking for, while providing neither context nor any effort at all.
I would also like to know if there is some kind of warning for "repeat offenders" in this case.
I have found some users who don't provide samples in many of their threads, even when they've been asked multiple times by multiple users.
Posts: 12,573
Threads: 352
Joined: Jun 2023
   
Replying to adelardsyah (12-04-2024, 04:10 PM):
In case of frequent violations, the user can be banned for a while, and their threads will be moved to removed immediately after they are created. If you know of such users, you can report them via reports or PM.
Posts: 172
Threads: 37
Joined: Jun 2023
 
@Addka72424: (For example, if it is impossible to accurately determine the users count)
How is that? For CSV, TSV, JSON, JSONL, and similar kinds of files, the unique line count or an email/phone regex count can easily determine the data's characteristics. Even for live access or other data types, determining the user count or row count is still feasible. I don't see any situation where a user or seller cannot identify this information unless they neglect to check the data.
This often leads to reports based on incomplete understanding. Many users seem to find and download data without understanding or analyzing it, simply posting titles like "AMD.com BREACH", only for it to turn out that the data is unrelated and merely includes the AMD name somewhere in its values.
When you ask beforehand, the responses are often vague, like "it contains a lot of diff data" or "I can't determine user count", or they may simply provide the total number of lines in the files, which is incorrect. (For example, I purchased a db advertised as having 13 million rows, only to discover that the seller had counted all lines in the SQL file, including comments, schema, and unrelated lines, resulting in an actual row count of just 8,000.)
Such data and threads should be categorized under Other Leaks, as the main section focuses on PII datasets related to individuals. If you cannot provide essential details like the source, data type, user count, and total size, the thread should be placed in the secondary sections.
We should also establish a standard format for thread titles, such as "Source/Website | Count/Size | Type/Format | Date/Year".
Posts: 12,573
Threads: 352
Joined: Jun 2023
   
12-04-2024, 06:30 PM
(This post was last modified: 12-04-2024, 06:32 PM by Addka72424.)
Replying to tail (12-04-2024, 05:52 PM):
I'm only talking about free databases. Of course, a seller must determine the exact number of users. I mean that in some cases the OPs themselves cannot determine the number of users, since they don't know even basic regex, let alone something more advanced. As for the date and the data source, problems can sometimes arise, since users sometimes receive data from third parties or find it elsewhere; as a result the source is lost and the date of the leak is much more difficult to determine. But the format and number of users really should be mandatory. I agree with you on everything else.
And oh, can you update the links in your threads? It looks like Phin still has not fixed filehaus. I will add mirrors to your threads and move this to an unofficial index.
Posts: 646
Threads: 17
Joined: Sep 2024
Replying to tail (12-04-2024, 05:52 PM):
I changed the title to incorporate your post here, which I couldn't agree with more.
Maybe there should be links explaining how to count lines, for new users?
Because I know some people are not tech-savvy enough, yet they are able to find or have a leak that no one else has.
These people might be able to count lines in txt or csv, but other file formats such as SQL or JSON might go over their heads.
We should establish a standard format for posting threads, including a format for the title.
We can make some kind of template for all users to follow, including new users. It would act as just a guideline, so as not to make all threads look the same. We can set some lines as mandatory but leave the rest to creativity. Threads which do not follow these guidelines should then be removed within XX hours, accounting for the credit refund window, as @Addka72424 has mentioned above.
Posts: 269
Threads: 6
Joined: Aug 2023
I totally agree with standards and samples for databases.
You can say that some info is the bare minimum, but users can add to it however they want.
Adding to the info @tail mentioned, I would love to see the size of the leak in MB/GB/TB before downloading. It is frustrating to see an interesting leak asking 8 credits for less than 1 KB.
So in my opinion it would be:
Source/Website | Count/Size (lines and MB, GB, TB uncompressed) | Type/Format | Date/Year
and users would organize their threads however they want, as long as this info is available.