
Mailing List Archive: Lucene: Java-Dev

Socket and file locks

 

 



marvin at rectangular

Nov 23, 2009, 10:31 AM

Post #1 of 7
Socket and file locks

On Sun, Nov 22, 2009 at 10:36:57AM +0000, Thomas Mueller (JIRA) wrote:

> Thomas Mueller commented on LUCENE-1877:
> ----------------------------------------
>
> > take it somewhere other than this closed issue.
>
> Yes, where?

The java-dev list.

> > shouldn't active code like that live in the application layer?
>
> Why?

You can all but guarantee that polling will work at the app layer, because you
can have almost full control over process priority.

If the polling code is lower down and hidden away, then it worries me that a
lock might be swept away by another process, and by the time the original
process realizes that it doesn't hold the lock anymore, the damage could
already have been done. Unless I'm missing something, it doesn't seem like a
failsafe design.

But this is "theoretical", I suppose:

> I'm just trying to say that in theory, the thread is problematic, but in
> practice it isn't. While file locking is not a problem in theory, but in
> practice.

Heh. :)

> > What happens when the app sleeps?
>
> Good question! Standby / hibernate are not supported. I didn't think about
> that. Is there a way to detect the wakeup?

Not sure.

FYI, I'm only an indirect contributor to Java Lucene. My main projects are
Lucy and KinoSearch, loose ports to C. I know the problem domain intimately,
but my Java skills are sketchy.

> > host name and the pid
>
> Yes. It is not so easy to get the PID in Java, I found:
> http://stackoverflow.com/questions/35842/process-id-in-java
> "ManagementFactory.getRuntimeMXBean().getName()".

A web search for "java process id" turns up a bazillion hits about how to hack
up a PID.

How annoying. This seems to me like a case of the perfect being the enemy of
the good. How many machines that run Java are running operating systems that
have no support for PIDs?

Hasn't somebody open sourced a "GiveMeTheFrikkinPID" library yet?
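
For reference, the hack quoted above can be sketched roughly as follows. This is a best-effort sketch, not a portable solution: the "pid@hostname" shape of the runtime name is a common JVM convention rather than a documented guarantee, and `PidSketch`, `parsePid` and `currentPid` are names made up for illustration. On Java 9 and later, `ProcessHandle.current().pid()` returns the PID directly.

```java
import java.lang.management.ManagementFactory;

public class PidSketch {
    /**
     * Parses the PID out of a runtime name of the assumed form "pid@hostname".
     * Returns -1 when the name does not have that shape.
     */
    static long parsePid(String runtimeName) {
        int at = runtimeName.indexOf('@');
        if (at <= 0) {
            return -1;
        }
        try {
            return Long.parseLong(runtimeName.substring(0, at));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Best-effort PID of the current JVM, or -1 if it cannot be determined. */
    static long currentPid() {
        return parsePid(ManagementFactory.getRuntimeMXBean().getName());
    }

    public static void main(String[] args) {
        System.out.println("pid (best effort): " + currentPid());
    }
}
```

Callers should treat -1 as "unknown" and fall back to some other identifier rather than assume a PID is always available.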

> What do you do if the lock was generated by another machine?

Require that all machines participating in the writer pool supply a unique
host ID as part of the locking API. Store that host ID in the lockfile and
only allow machines to sweep stale files that they own.

Unfortunately, that's not failsafe either, though: misconfiguration leads to
index corruption rather than deadlock, when two machines that use identical
host IDs sweep each other's lockfiles and write simultaneously.
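
The sweep rule, and the way it breaks under misconfiguration, can be sketched like this. The lock file layout (host ID on the first line, PID on the second) and all names here are assumptions made for illustration; nothing in the thread fixes a format. The rule only says who may sweep; deciding that a lock is actually stale is a separate check.

```java
public class HostIdSweep {
    /** Lock file content in this sketch: "<hostId>\n<pid>\n" (an assumed format). */
    static String lockContent(String hostId, long pid) {
        return hostId + "\n" + pid + "\n";
    }

    /** Extracts the host ID stored on the first line of the lock file content. */
    static String ownerOf(String content) {
        int newline = content.indexOf('\n');
        return (newline < 0 ? content : content.substring(0, newline)).trim();
    }

    /** A machine may sweep a stale lock only if the stored host ID is its own. */
    static boolean maySweep(String content, String myHostId) {
        return ownerOf(content).equals(myHostId);
    }

    public static void main(String[] args) {
        String lock = lockContent("indexer-a", 4242);
        System.out.println(maySweep(lock, "indexer-a")); // true
        System.out.println(maySweep(lock, "indexer-b")); // false

        // Misconfiguration: a second machine that was also given "indexer-a"
        // passes the same check and may sweep a lock that is still live.
        String secondMachineId = "indexer-a";
        System.out.println(maySweep(lock, secondMachineId)); // true
    }
}
```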

> I tried with using a server socket, so you need the IP address, but
> unfortunately, sometimes the network is not configured correctly (but maybe
> it's possible to detect that). Maybe the two machines can't access each
> other over TCP/IP.

This is an intriguing approach. Can it be designed to be failsafe?

If the server and the client can't access each other, that's failsafe at
least, because the client will simply fail to acquire the lock.

But if a client is misconfigured, could it contact the wrong host,
successfully open a port that coincidentally happens to be open, believe it
has acquired the lock and corrupt the index? If so, could some sort of
handshake prevent that?
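
One possible shape for such a handshake, as a sketch only: the lock holder answers every connection with a token that it also stored in the lock file, and a peer counts the lock as held by that server only if the reply matches. The token exchange, the one-second timeouts and every name below are assumptions for illustration, not anything proposed in the thread. Any mismatch, timeout or connection error is reported as "no match", so a coincidentally open port on the wrong host fails the check.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class HandshakeSketch {
    /** Opens a server socket on an ephemeral loopback port (demo setup only). */
    static ServerSocket listenOnLoopback() {
        try {
            return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lock holder: answers each connection with the token stored in the lock file. */
    static void serve(ServerSocket server, String token) {
        Thread t = new Thread(() -> {
            while (!server.isClosed()) {
                try (Socket s = server.accept();
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    out.println(token);
                } catch (IOException e) {
                    return; // this sketch stops serving on any I/O error, including close()
                }
            }
        });
        t.setDaemon(true);
        t.start();
    }

    /** Peer: true only if something at host:port replies with exactly the expected token. */
    static boolean peerPresentsToken(InetAddress host, int port, String expectedToken) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 1000);
            s.setSoTimeout(1000);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
            return expectedToken.equals(in.readLine());
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ServerSocket server = listenOnLoopback();
        serve(server, "token-from-lock-file");
        InetAddress lo = InetAddress.getLoopbackAddress();
        System.out.println(peerPresentsToken(lo, server.getLocalPort(), "token-from-lock-file"));
        System.out.println(peerPresentsToken(lo, server.getLocalPort(), "some-other-token"));
    }
}
```

A token check like this only guards against talking to the wrong endpoint; it does not by itself answer the read-lock reference counting question.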

I'm also curious if we can use this approach for read locking. For that, you
need a reference counting scheme -- one ref for each reader accessing the
index. Is that possible under the socket model?

> > hard links
>
> Yes, but it looks like this doesn't work always.

It is theoretically possible for the link() call to return false incorrectly
when the hard link has actually been created, for instance because a network
problem prevents the "success" packet from getting back to the client from the
server.

However, this is failsafe, because the requestor will not believe that the
lock has been secured and thus won't write. That process won't be able to
sweep away the orphaned lock file itself, but once it exits, a graceful
recovery will occur.
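
In Java terms, the hard-link approach might look like the sketch below. It uses `Files.createLink`, which only arrived with Java 7 (after this thread), and whether that call maps to an atomic link() on a particular network file system is an assumption to verify. `HardLinkLock` and `tryLock` are illustrative names. Every failure is reported as "not acquired", which mirrors the failsafe argument above: a false negative can leave an orphaned lock file behind, but never an unprotected write.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkLock {
    /**
     * Tries to take the lock by hard-linking a private temp file to the shared
     * lock name. Returns true only if the link call reported success.
     */
    static boolean tryLock(Path dir, String lockName) {
        Path lock = dir.resolve(lockName);
        Path temp = null;
        try {
            temp = Files.createTempFile(dir, "lock-", ".tmp");
            Files.createLink(lock, temp); // fails if the lock name already exists
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            // Includes the case where the link was created but success was not reported.
            return false;
        } finally {
            if (temp != null) {
                try {
                    Files.deleteIfExists(temp); // the lock name keeps the other link
                } catch (IOException ignored) {
                    // a leftover temp file is harmless in this sketch
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hardlink-lock");
        System.out.println(tryLock(dir, "write.lock")); // first caller
        System.out.println(tryLock(dir, "write.lock")); // second caller
    }
}
```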

Marvin Humphrey


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


thomas.tom.mueller at gmail

Nov 24, 2009, 1:42 AM

Post #2 of 7
Socket and file locks

Hi,

> > > shouldn't active code like that live in the application layer?
> > Why?
> You can all but guarantee that polling will work at the app layer

The application layer may also run with low priority. In operating
systems, it's usually the lower layers that have more 'rights'
(priority), not the higher ones (I'm not saying it should be
like that in Java). I just think the application layer should not have
to deal with write locks or removing write locks.

> by the time the original process realizes that it doesn't hold the lock anymore, the damage could already have been done.

Yes, I'm not sure how to best avoid that (with any design). Asking the
application layer or the user whether the lock file can be removed is
probably more dangerous than trying the best in Lucene.

Standby / hibernate: the question is, if the machine or process is
currently not running, does the process still hold the lock? I think
not, because the machine might as well be turned off. How can you detect
whether the machine is turned off or in hibernate mode? I guess
that's a problem for all mechanisms (socket / file lock / background
thread).

When a hibernated process wakes up again, it thinks it owns the lock.
Even if the process checks before each write, it is unsafe:

if (isStillLocked()) {
    write();
}

The process could be suspended after isStillLocked() and wake up just
before write(). One protection is: the second process (the one that
breaks the lock) would work on a copy of the data instead of the
original file (it could delete / truncate the original file after
creating the copy). On Windows, renaming the file might work (not sure);
on Linux you probably need to copy the content to a new file. That way,
the awoken process can only destroy inactive data.

The question is: do we need to solve this problem? How big is the
risk? Instead of solving this problem completely, you could detect it
after the fact without much overhead, and throw an exception saying:
"data may be corrupt now".
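
A minimal sketch of that after-the-fact check, assuming the lock carries some owner token that can be re-read cheaply. The `Supplier` standing in for "read the current lock token", the exception type and the method name are all hypothetical. The check cannot prevent the damage; it only reports that the lock changed hands while the write was in flight.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class VerifiedWrite {
    /** Thrown when the lock turned out to be lost while a write was in flight. */
    static class LockLostException extends RuntimeException {
        LockLostException(String message) {
            super(message);
        }
    }

    /**
     * Runs the write, then re-reads the lock token and compares it with ours.
     * currentToken is a stand-in for reading the token from the lock file.
     */
    static void writeThenVerify(Runnable write, Supplier<String> currentToken, String myToken) {
        write.run();
        if (!myToken.equals(currentToken.get())) {
            throw new LockLostException("data may be corrupt now");
        }
    }

    public static void main(String[] args) {
        AtomicReference<String> lockToken = new AtomicReference<>("writer-1");

        // Normal case: the lock is still ours after the write.
        writeThenVerify(() -> { }, lockToken::get, "writer-1");

        // Simulated case: another process broke the lock during the write.
        try {
            writeThenVerify(() -> lockToken.set("writer-2"), lockToken::get, "writer-1");
        } catch (LockLostException e) {
            System.out.println(e.getMessage());
        }
    }
}
```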

PID: With the PID, you could check if the process still runs. But it
could be another process with the same PID (is that possible?), or the
same PID on a different machine (when using a network share). It's
probably safer if you can communicate with the lock owner (using
TCP/IP or over the file system by deleting/creating a file).

Unique id: The easiest solution is to use a UUID (a cryptographically
secure random number). That problem _is_ solved (some systems have
trouble generating entropy, but there are workarounds). If you have a
communication channel to the process anyway, you could ask for this
UUID. Once you have a communication channel, you can do a lot
(reference counting, safely transferring the lock, ...).
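
As a sketch of the UUID idea: the lock file is created only if it does not exist yet and holds a fresh random UUID, so ownership can later be checked by comparing contents. `UUID.randomUUID()` draws on a cryptographically strong random source; the file layout and the names `tryCreate` and `isOwnedBy` are assumptions for illustration. Note that creating the file and writing the UUID are two steps here, so a reader could briefly see an empty file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

public class UuidLockFile {
    /**
     * Creates the lock file if it does not exist yet and stores a fresh UUID in it.
     * Returns the UUID, or null if the file already exists or cannot be written.
     */
    static String tryCreate(Path lockFile) {
        String id = UUID.randomUUID().toString();
        try {
            Files.write(lockFile, id.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return id;
        } catch (IOException e) {
            return null;
        }
    }

    /** True only if the lock file can be read and contains exactly the given UUID. */
    static boolean isOwnedBy(Path lockFile, String id) {
        try {
            String stored = new String(Files.readAllBytes(lockFile), StandardCharsets.UTF_8);
            return id.equals(stored);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path lock = Files.createTempDirectory("uuid-lock").resolve("write.lock");
        String mine = tryCreate(lock);
        System.out.println("created: " + (mine != null));
        System.out.println("second create refused: " + (tryCreate(lock) == null));
        System.out.println("owned by me: " + isOwnedBy(lock, mine));
    }
}
```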

> If the server and the client can't access each other

How do you find out that the server is still running? My point is: I'd like
to have a secure, automatic way to break the lock if the machine or
process is stopped. And in my experience, native file locking is
problematic for this.

You could also combine solutions (such as: combine the 'open a server
socket' solution with 'background thread' solution). I'm not sure if
it's worth it to solve the 'hibernate' problem.

Regards,
Thomas



lucene at mikemccandless

Nov 27, 2009, 7:10 AM

Post #3 of 7
Re: Socket and file locks

I think a LockFactory for Lucene that implemented the ideas you &
Marvin are discussing in LUCENE-1877, and/or the approach you
implemented in the H2 DB, would be a useful addition to Lucene!

For many apps, the simple LockFactory impls suffice, but for apps
where multiple machines can become the writer, it gets hairy. Having
an always correct Lock impl for these apps would be great.

Note that Lucene has some basic tools (in oal.store) for asserting
that a LockFactory is correct (see LockVerifyServer), so it's a useful
way to test that things are working from Lucene's standpoint.

Mike

On Fri, Nov 27, 2009 at 9:23 AM, Thomas Mueller
<thomas.tom.mueller [at] gmail> wrote:
> Hi,
>
> I'm wondering if you are interested in automatically releasing the
> write lock. See also my comments on
> https://issues.apache.org/jira/browse/LUCENE-1877 - I thought it's a
> problem worth solving, because it's also in the Lucene FAQ list at
> http://wiki.apache.org/lucene-java/LuceneFAQ#What_is_the_purpose_of_write.lock_file.2C_when_is_it_used.2C_and_by_which_classes.3F
>
> Unfortunately there seems to be no solution that 'always works', but
> delegating the task and responsibility to the application / to the
> user is problematic as well. For example, a user of the H2 database
> (that supports Lucene fulltext indexing) suggested to automatically
> remove the write.lock file whenever the file is there:
> http://code.google.com/p/h2database/issues/detail?id=141 - sounds a
> bit dangerous in my view.
>
> So, if you are interested in solving the problem, then maybe I can help.
> If not, then I will not bother you any longer :-)
>
> Regards,
> Thomas


sanne.grinovero at gmail

Nov 28, 2009, 6:26 AM

Post #4 of 7
Re: Socket and file locks

Hello,
Together with the Infinispan Directory we developed such a
LockFactory; I'd be more than happy if you wanted to add some pointers
to it in the Lucene documentation/readme.
It depends on Infinispan for communication between machines
(JGroups, indirectly), but it's not required to use an Infinispan
Directory; you could combine it with a Directory impl of your choice.
It was tested with the LockVerifyServer mentioned by Michael
McCandless and also with some other tests inspired by it (in-VM for
lower-delay coordination and verification, while the LockFactory was
forced to use real network communication).

While this is a technology preview and performance regarding the
Directory code is still unknown, I believe the LockFactory was the
most tested component.

It is free to download and inspect (LGPL):
http://anonsvn.jboss.org/repos/infinispan/trunk/lucene-directory/

Regards,
Sanne



lucene at mikemccandless

Nov 29, 2009, 3:13 AM

Post #5 of 7
Re: Socket and file locks

This looks great!

Maybe it makes most sense to create a wiki page
(http://wiki.apache.org/lucene-java) for interesting LockFactory
implementations/tradeoffs, and add this there?

Mike



sanne.grinovero at gmail

Nov 29, 2009, 3:06 PM

Post #6 of 7
Re: Socket and file locks

Hello,

I'm glad you appreciate it; I've added the Wiki page here:
http://wiki.apache.org/lucene-java/AvailableLockFactories

I deliberately avoided copy-pasting the full javadocs of each
implementation, as they would go out of date or be too specific to some
version; I limited myself to a few words highlighting the
differences, as a quick overview of what is available.
I hope you like it; I'm open to suggestions.

Regards,
Sanne


2009/11/29 Michael McCandless <lucene [at] mikemccandless>:
> This looks great!
>
> Maybe it makes most sense to create a wiki page
> (http://wiki.apache.org/lucene-java) for interesting LockFactory
> implementations/tradeoffs, and add this there?
>
> Mike
>
> On Sat, Nov 28, 2009 at 9:26 AM, Sanne Grinovero
> <sanne.grinovero [at] gmail> wrote:
>> Hello,
>> Together with the Infinispan Directory we developed such a
>> LockFactory; I'd me more than happy if you wanted to add some pointers
>> to it in the Lucene documention/readme.
>> This depends on Infinispan for multiple-machines communication
>> (JGroups, indirectly) but
>> it's not required to use an Infinispan Directory, you could combine it
>> with a Directory impl of choice.
>> This was tested with the LockVerifyServer mentioned by Michael
>> McCandless and also
>> with some other tests inspired from it (in-VM for lower delay
>> coordination and verify, while the LockFactory was forced to
>> use real network communication).
>>
>> While this is a technology preview and performance regarding the
>> Directory code is still unknown, I believe the LockFactory was the
>> most tested component.
>>
>> free to download and inspect (LGPL):
>> http://anonsvn.jboss.org/repos/infinispan/trunk/lucene-directory/
>>
>> Regards,
>> Sanne
>>
>> 2009/11/27 Michael McCandless <lucene [at] mikemccandless>:
>>> I think a LockFactory for Lucene that implemented the ideas you &
>>> Marvin are discussing in LUCENE-1877,  and/or the approach you
>>> implemented in the H2 DB, would be a useful addition to Lucene!
>>>
>>> For many apps, the simple LockFactory impls suffice, but for apps
>>> where multiple machines can become the writer, it gets hairy.  Having
>>> an always correct Lock impl for these apps would be great.
>>>
>>> Note that Lucene has some basic tools (in oal.store) for asserting
>>> that a LockFactory is correct (see LockVerifyServer), so it's a useful
>>> way to test that things are working from Lucene's standpoint.
>>>
>>> Mike
>>>
>>> On Fri, Nov 27, 2009 at 9:23 AM, Thomas Mueller
>>> <thomas.tom.mueller [at] gmail> wrote:
>>>> Hi,
>>>>
>>>> I'm wondering if your are interested in automatically releasing the
>>>> write lock. See also my comments on
>>>> https://issues.apache.org/jira/browse/LUCENE-1877 - I thought it's a
>>>> problem worth solving, because it's also in the Lucene FAQ list at
>>>> http://wiki.apache.org/lucene-java/LuceneFAQ#What_is_the_purpose_of_write.lock_file.2C_when_is_it_used.2C_and_by_which_classes.3F
>>>>
>>>> Unfortunately there seems to be no solution that 'always works', but
>>>> delegating the task and responsibility to the application / to the
>>>> user is problematic as well. For example, a user of the H2 database
>>>> (that supports Lucene fulltext indexing) suggested to automatically
>>>> remove the write.lock file whenever the file is there:
>>>> http://code.google.com/p/h2database/issues/detail?id=141 - sounds a
>>>> bit dangerous in my view.
>>>>
>>>> So, if you are interested to solve the problem, then maybe I can help.
>>>> If not, then I will not bother you any longer :-)
>>>>
>>>> Regards,
>>>> Thomas
>>>>
>>>>
>>>>
>>>>> > > shouldn't active code like that live in the application layer?
>>>>> > Why?
>>>>> You can all but guarantee that polling will work at the app layer
>>>>
>>>> The application layer may also run with low priority. In operating
>>>> systems, it's usually the lower layer that have more 'rights'
>>>> (priority), and not the higher levels (I'm not saying it should be
>>>> like that in Java). I just think the application layer should not have
>>>> to deal with write locks or removing write locks.
>>>>
>>>>> by the time the original process realizes that it doesn't hold the lock anymore, the damage could already have been done.
>>>>
>>>> Yes, I'm not sure how to best avoid that (with any design). Asking the
>>>> application layer or the user whether the lock file can be removed is
>>>> probably more dangerous than trying the best in Lucene.
>>>>
>>>> Standby / hibernate: the question is, if the machine or process is
>>>> currently not running, does the process still hold the lock? I think
>>>> not, because the machine might as well be turned off. How to detect
>>>> whether the machine is turned off versus in hibernate mode? I guess
>>>> that's a problem for all mechanisms (socket / file lock / background
>>>> thread).
>>>>
>>>> When a hibernated process wakes up again, it thinks it owns the lock.
>>>> Even if the process checks before each write, it is unsafe:
>>>>
>>>> if (isStillLocked()) {
>>>>  write();
>>>> }
>>>>
>>>> The process could be suspended after isStillLocked() and wake up before
>>>> write(), after another process has already broken the lock.
>>>> One protection is: The second process (the one that breaks the lock)
>>>> would need to work on a copy of the data instead of the original file
>>>> (it could delete / truncate the original file after creating a copy).
>>>> On Windows, renaming the file might work (not sure); on Linux you
>>>> probably need to copy the content to a new file. Like that, the awoken
>>>> process can only destroy inactive data.
>>>>
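A minimal sketch of the "work on a copy" protection described above, assuming the data lives in a single file (a real Lucene index is a directory of many files, so this only illustrates the idea). The class name `LockBreaker` and the method `takeOver` are hypothetical, not part of Lucene or H2.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: the process that breaks a stale lock moves on to a copy.
final class LockBreaker {

    /**
     * Copies the data file to a new path, then truncates the original.
     * Returns the path the new lock owner should use from now on, so that a
     * late write from the previous owner can only touch the inactive original.
     */
    static Path takeOver(Path original, Path copy) throws IOException {
        // Step 1: preserve the current content under a new name.
        Files.copy(original, copy, StandardCopyOption.REPLACE_EXISTING);
        // Step 2: empty the original; it is no longer the active data.
        try (FileChannel channel = FileChannel.open(original, StandardOpenOption.WRITE)) {
            channel.truncate(0);
        }
        return copy;
    }
}
```

A write that lands between the copy and the truncate is lost rather than merged, so this narrows the damage but does not make breaking the lock safe by itself.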
>>>> The question is: do we need to solve this problem? How big is the
>>>> risk? Instead of solving this problem completely, you could detect it
>>>> after the fact without much overhead, and throw an exception saying:
>>>> "data may be corrupt now".
>>>>
>>>> PID: With the PID, you could check if the process still runs. Or it
>>>> could be another process with the same PID (is that possible?), or the
>>>> same PID but a different machine (when using a network share). It's
>>>> probably safer if you can communicate with the lock owner (using
>>>> TCP/IP or over the file system by deleting/creating a file).
>>>>
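As a rough illustration of recording both PID and host name, the sketch below parses the string returned by `ManagementFactory.getRuntimeMXBean().getName()`, the call mentioned earlier in this thread. On common JVMs that string looks like `pid@hostname`, but the format is not guaranteed, so the parser returns null when the shape differs. `ProcessIdentity` and its methods are hypothetical names.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper for identifying a lock owner by PID and host name.
final class ProcessIdentity {

    /** Returns the JVM's runtime name; often "pid@hostname", but not guaranteed. */
    static String runtimeName() {
        return ManagementFactory.getRuntimeMXBean().getName();
    }

    /**
     * Splits a "pid@hostname" string into {pid, hostname}.
     * Returns null if the string does not have that shape.
     */
    static String[] parse(String name) {
        int at = name.indexOf('@');
        if (at <= 0 || at == name.length() - 1) {
            return null; // no '@', empty pid, or empty host
        }
        String pid = name.substring(0, at);
        for (int i = 0; i < pid.length(); i++) {
            if (!Character.isDigit(pid.charAt(i))) {
                return null; // pid part is not numeric
            }
        }
        return new String[] { pid, name.substring(at + 1) };
    }
}
```

Storing the host next to the PID addresses the network-share case, but neither value rules out PID reuse after a reboot.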
>>>> Unique id: The easiest solution is to use a UUID (a cryptographically
>>>> secure random number). That problem _is_ solved (some systems have
>>>> trouble generating entropy, but there are workarounds). If you have a
>>>> communication channel to the process anyway, you could ask for this
>>>> UUID. Once you have a communication channel, you can do a lot
>>>> (reference counting, safely transfer the lock,...).
>>>>
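The unique-id idea can be combined with the "detect it after the fact" suggestion from earlier in this message. The sketch below is one possible shape, not Lucene's or H2's implementation: each owner writes a random UUID into the lock file and re-reads it after writing data. `TokenLock`, `acquire` and `verifyStillOwner` are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// Hypothetical lock whose file content identifies the current owner.
final class TokenLock {
    private final Path lockFile;
    // randomUUID() draws from a cryptographically strong random generator.
    private final String token = UUID.randomUUID().toString();

    TokenLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Creates the lock file; fails with FileAlreadyExistsException if it exists. */
    void acquire() throws IOException {
        Files.write(lockFile, token.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }

    /** Call after writing data: throws if someone else took the lock meanwhile. */
    void verifyStillOwner() throws IOException {
        String current;
        try {
            current = new String(Files.readAllBytes(lockFile), StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            current = null; // lock file was removed
        }
        if (!token.equals(current)) {
            throw new IOException("data may be corrupt now");
        }
    }
}
```

This only reports the problem after the write has happened; it does not prevent the write.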
>>>>> If the server and the client can't access each other
>>>>
>>>> How to find out that the server is still running? My point is: I'd like
>>>> to have a secure, automatic way to break the lock if the machine or
>>>> process is stopped. And from my experience, native file locking is
>>>> problematic for this.
>>>>
>>>> You could also combine solutions (such as: combine the 'open a server
>>>> socket' solution with 'background thread' solution). I'm not sure if
>>>> it's worth it to solve the 'hibernate' problem.
>>>>
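For reference, the 'open a server socket' approach can be sketched as follows, assuming all contenders run on the same machine and agree on a port (across machines, the owner's address would have to be recorded somewhere, for example in the lock file). The operating system releases the port when the process dies, which is what makes the lock break automatically. `SocketLock` and `tryAcquire` are hypothetical names.

```java
import java.io.Closeable;
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical lock that is held for as long as a listening socket stays open.
final class SocketLock implements Closeable {
    private final ServerSocket socket;

    private SocketLock(ServerSocket socket) {
        this.socket = socket;
    }

    /**
     * Tries to bind the given loopback port (0 picks a free port).
     * Returns the lock, or null if another process already listens there.
     */
    static SocketLock tryAcquire(int port) throws IOException {
        try {
            return new SocketLock(
                    new ServerSocket(port, 1, InetAddress.getLoopbackAddress()));
        } catch (BindException e) {
            return null; // port is taken: the lock is held by someone else
        }
    }

    /** The port actually bound; useful when 0 was requested. */
    int port() {
        return socket.getLocalPort();
    }

    /** Releases the lock by closing the listening socket. */
    @Override
    public void close() throws IOException {
        socket.close();
    }
}
```

A hibernated owner still holds the port on wake-up, so this variant does not address the hibernate case discussed above.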
>>>> Regards,
>>>> Thomas
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
>>>> For additional commands, e-mail: java-dev-help [at] lucene

lucene at mikemccandless

Nov 30, 2009, 2:42 AM

Post #7 of 7 (388 views)
Re: Socket and file locks

That page looks awesome -- thanks for contributing it!

Mike

On Sun, Nov 29, 2009 at 6:06 PM, Sanne Grinovero
<sanne.grinovero [at] gmail> wrote:
> Hello,
>
> I'm glad you appreciate it; I've added the Wiki page here:
> http://wiki.apache.org/lucene-java/AvailableLockFactories
>
> I deliberately avoided copy-pasting the full javadocs of each
> implementation, as that would go out of date or be too specific to some
> version; I limited myself to a few words highlighting the
> differences, as a quick overview of what is available.
> I hope you like it; I'm open to suggestions.
>
> Regards,
> Sanne
>
>
> 2009/11/29 Michael McCandless <lucene [at] mikemccandless>:
>> This looks great!
>>
>> Maybe it makes most sense to create a wiki page
>> (http://wiki.apache.org/lucene-java) for interesting LockFactory
>> implementations/tradeoffs, and add this there?
>>
>> Mike
>>
>> On Sat, Nov 28, 2009 at 9:26 AM, Sanne Grinovero
>> <sanne.grinovero [at] gmail> wrote:
>>> Hello,
>>> Together with the Infinispan Directory we developed such a
>>> LockFactory; I'd be more than happy if you wanted to add some pointers
>>> to it in the Lucene documentation/readme.
>>> This depends on Infinispan for multiple-machine communication
>>> (JGroups, indirectly), but it's not required to use an Infinispan
>>> Directory; you could combine it with a Directory impl of choice.
>>> This was tested with the LockVerifyServer mentioned by Michael
>>> McCandless and also with some other tests inspired by it (in-VM for
>>> lower-delay coordination and verification, while the LockFactory was
>>> forced to use real network communication).
>>>
>>> While this is a technology preview and performance regarding the
>>> Directory code is still unknown, I believe the LockFactory was the
>>> most tested component.
>>>
>>> It is free to download and inspect (LGPL):
>>> http://anonsvn.jboss.org/repos/infinispan/trunk/lucene-directory/
>>>
>>> Regards,
>>> Sanne
>>>
>>> 2009/11/27 Michael McCandless <lucene [at] mikemccandless>:
>>>> I think a LockFactory for Lucene that implemented the ideas you &
>>>> Marvin are discussing in LUCENE-1877,  and/or the approach you
>>>> implemented in the H2 DB, would be a useful addition to Lucene!
>>>>
>>>> For many apps, the simple LockFactory impls suffice, but for apps
>>>> where multiple machines can become the writer, it gets hairy.  Having
>>>> an always correct Lock impl for these apps would be great.
>>>>
>>>> Note that Lucene has some basic tools (in oal.store) for asserting
>>>> that a LockFactory is correct (see LockVerifyServer), so it's a useful
>>>> way to test that things are working from Lucene's standpoint.
>>>>
>>>> Mike
>>>>